Abstract: Energy consumption represents a significant cost in
data center operation. A large fraction of the energy, however, is 
used to power idle servers when the workload is low. Dynamic 
provisioning techniques aim at saving this portion of the energy, 
by turning off unnecessary servers. In this paper, we explore
how much performance gain knowing future workload
information can bring to dynamic provisioning. In particular, we
study the dynamic provisioning problem under the cost model
that a running server consumes a fixed amount of energy per
unit time, and develop online solutions with and without
future workload information available. We first reveal an elegant
structure of the off-line dynamic provisioning problem, which 
allows us to characterize and achieve the optimal solution in 
a "divide-and-conquer" manner. We then exploit this insight to 
design three online algorithms with competitive ratios $2 - \alpha$,
$(e - \alpha)/(e - 1) \approx 1.58 - \alpha/(e - 1)$, and $e/(e - 1 + \alpha)$, respectively,
where $0 \le \alpha \le 1$ is the fraction of a critical window in
which future workload information is available. A fundamental 
observation is that future workload information beyond the critical 
window will not improve dynamic provisioning performance. Our 
algorithms are decentralized and are simple to implement. We 
demonstrate their effectiveness in simulations using real-world 
traces. We also compare their performance with state-of-the-art 
solutions. 



I. Introduction 

As Internet services, such as search and social networking,
have become more widespread in recent years, the energy
consumption of data centers has been skyrocketing. In 2005,
data centers worldwide consumed an estimated 152 billion 
kilowatt-hours (kWh) of energy, roughly 1% of the world total
energy consumption [1]. Power consumption at such a level was
enough to power half of Italy [2]. Energy cost is approaching
overall hardware cost in data centers [3], and is growing 12%
annually [4].

Recent works have explored electricity price fluctuation in
time and geographical load balancing across data centers to
cut the electricity bill; see, e.g., [5], [6], [7], [8] and
the references therein. Meanwhile, it is nevertheless critical to 
minimize the actual energy footprint in individual data centers. 

Energy consumption in a data center is a product of the
PUE¹ and the energy consumed by the servers. There have
been substantial efforts in improving PUE, e.g., by optimizing
cooling [9], [10] and power management [11]. We focus on
reducing the energy consumed by the servers in this paper. 

Real-world statistics reveal three observations that suggest
ample saving is possible in server energy consumption [12],
[13], [14], [15], [16], [17]. First, workload in a data center
often fluctuates significantly on the timescale of hours or
days, expressing a large "peak-to-mean" ratio. Second, data
centers today often provision for far more than the observed
peak to accommodate both the predictable workload and
the unpredictable flash crowd². Such static over-provisioning
results in low average utilization for most servers in data
centers. Third, a low-utilized or idle server consumes more
than 60% of its peak power. These observations imply that a
large portion of the energy consumed by servers goes into
powering nearly-idle servers, and it can be best saved by
turning off servers during the off-peak periods.

¹Power usage effectiveness (PUE) is defined as the ratio between the amount
of power entering a data center and the power used to run its computer
infrastructure. The closer to one PUE is, the better energy utilization is.

One promising technique exploiting the above insights is 
dynamic provisioning, which turns on a minimum number of 
servers to meet the current demand and dispatches the load 
among the running servers to meet Service Level Agreements 
(SLA), making the data center "power-proportional". 

There have been a significant amount of efforts in develop- 
ing such technique, initiated by the pioneering works lfT2l ifTSl 
a decade ago. Among them, one line of works ifTSl . ifTSl . 
[14] examines the practical feasibility and advantage of dynamic
provisioning using real-world traces, suggesting substantial 
gain is indeed possible in practice. Another line of works lfT2l . 
mi, 1201 . llT4l focus on developing algorithms by utilizing 
various tools from queuing theory, control theory, and ma- 
chine learning, providing algorithmic insights in synthesizing 
effective solutions. These existing works provide a number 
of schemes that deliver favorable performance justified by 
theoretic analysis and/or practical evaluations. See [21] for
a recent survey.

The effectiveness of these exciting schemes, however, usually
relies on being able to predict future workload to a certain
extent, e.g., using model fitting to forecast future workload
from historical data [14]. This naturally leads to the following
questions:

• Can we design online solutions that require zero future
workload information, yet still achieve close-to-optimal
performance?

• Can we characterize the benefit of knowing future
workload in dynamic provisioning?

Answers to these questions provide fundamental understanding 
on how much performance gain one can have by exploiting 
future workload information in dynamic provisioning. 

Recently, Lin et al. [20] propose an algorithm that requires
almost-zero future workload information³ and achieves a
competitive ratio of 3, i.e., the energy consumption is at most 3
times the minimum (computed with perfect future knowledge).
In simulations, they further show the algorithm can exploit
available future workload information to improve the
performance. These results are very encouraging, indicating that a
complete answer to the questions is possible.

²In May 2011, Amazon's data center was down for hours due to a surge
of downloads of Lady Gaga's song "Born This Way".

In this paper, we further explore answers to the questions, 
and make the following contributions: 

• We consider a scenario where a running server consumes
a fixed amount of energy per unit time. We reveal that the
dynamic provisioning problem has an elegant structure 
that allows us to solve it in a "divide-and-conquer" 
manner. This insight leads to a full characterization of 
the optimal solution, achieved by using a centralized 
procedure. 

• We show that, interestingly, the optimal solution can
also be attained by the data center adopting a simple
last-empty-server-first job-dispatching strategy⁴ and each
server independently solving a classic ski-rental problem.
We build upon this architectural insight to design
three decentralized online algorithms, all of which have better
competitive ratios than state-of-the-art solutions. One is
a deterministic algorithm with competitive ratio $2 - \alpha$,
where $0 \le \alpha \le 1$ is the fraction of a critical window in
which future workload information is available. The other
two are randomized algorithms with competitive ratios
$(e - \alpha)/(e - 1) \approx 1.58 - \alpha/(e - 1)$ and $e/(e - 1 + \alpha)$,
respectively. We prove that $2 - \alpha$ and $e/(e - 1 + \alpha)$ are the
best competitive ratios for deterministic and randomized
online algorithms under our last-empty-server-first
job-dispatching strategy.

• Our results lead to a fundamental observation: under
the cost model that a running server consumes a fixed
amount of energy per unit time, future workload information
beyond the critical window will not improve the dynamic 
provisioning performance. The size of the critical window 
is determined by the wear-and-tear cost and the unit-time 
energy cost of running one server. 

• Our algorithms are simple and easy to implement. We 
demonstrate the effectiveness of our algorithms in sim- 
ulations using real-world traces. We also compare their 
performance with state-of-the-art solutions. 
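
As a quick numerical illustration of the competitive ratios listed above, the following Python sketch evaluates the three expressions 2 - α, (e - α)/(e - 1), and e/(e - 1 + α) for a given α in [0, 1]. The function name `competitive_ratios` and the dictionary keys are illustrative choices, not notation from the paper.

```python
import math


def competitive_ratios(alpha: float) -> dict:
    """Evaluate the three competitive-ratio expressions for a given alpha.

    alpha is the fraction of the critical window for which future
    workload information is available; it must lie in [0, 1].
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    e = math.e
    return {
        # deterministic algorithm: 2 - alpha
        "deterministic": 2.0 - alpha,
        # first randomized algorithm: (e - alpha) / (e - 1)
        "randomized_first": (e - alpha) / (e - 1.0),
        # second randomized algorithm: e / (e - 1 + alpha)
        "randomized_second": e / (e - 1.0 + alpha),
    }


if __name__ == "__main__":
    for alpha in (0.0, 0.5, 1.0):
        print(alpha, competitive_ratios(alpha))
```

With α = 0 the expressions reduce to 2 and e/(e - 1) ≈ 1.58, and with α = 1 all three equal 1, which is consistent with the claim that information beyond the critical window brings no further gain.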

The rest of the paper is organized as follows. We formulate
the problem in Section II. Section III reveals the important
structure of the formulated problem, characterizes the optimal
solution, and designs a simple decentralized offline algorithm
achieving the optimal. In Section IV, we propose the online
algorithms and provide performance guarantees. Section V
presents the numerical experiments and Section VI concludes
the paper.

³The LCP algorithm proposed in [20] only relies on an estimate of the job
arrival rate of the upcoming slot.

⁴Readers might notice that this job-dispatching strategy shares some
similarity with the most-recently-busy strategy used in the DELAYEDOFF
algorithm [22]. Actually there are subtle yet important differences, which will
be discussed in detail in Section IV-D.



II. Problem Formulation 
A. Settings and Models 

We consider a data center consisting of a set of homogeneous
servers. Without loss of generality, we assume each
server has a unit service capacity⁵, i.e., it can only serve one
unit workload per unit time. Each server consumes $P$ energy
per unit time if it is on and zero otherwise. We define $\beta_{on}$ and
$\beta_{off}$ as the cost of turning a server on and off, respectively.
Such wear-and-tear cost, including the amortized service
interruption and hard-disk failure cost [19], is comparable to the
energy cost of running a server for several hours [20].

The results we develop in this paper apply to both of the
following two types of workloads⁶:

• "mice" type of workload, such as "request-response" web
serving. Each job of this type has a small transaction size
and short duration. A number of existing works lfT2l . ifTSll . 
1201 . ||231 model such workload by a discrete-time fluid 
model. In the model, time is chopped into equal-length 
slots. Jobs arriving in one slot get served in the same slot. 
Workload can be split among running servers at arbitrary 
granularity like fluid.
• "elephant" type of workload, such as virtual machine
hosting in cloud computing. Each job of this type has
a large transaction size, and can last for a long time. We
model such workload by a continuous-time brick model.
In this model, time is continuous, and we assume one
server can only serve one job⁷. Jobs arrive and depart
at arbitrary times, and no two job arrival/departure events
happen simultaneously.
For the discrete-time fluid model, servers toggled at the 
discrete time epoch will not interrupt job execution and thus no 
job migration is incurred. This neat abstraction allows research 
to focus on server on-off scheduling to minimize the cost. 
For the continuous-time brick model, when a server is turned
off, the long-lasting job running on it needs to be migrated
to another server. In general, such non-trivial migration cost
needs to be taken into account when toggling servers.

In the following, we present our results based on the 
continuous-time brick model. We add discussions to show the 
algorithms and results are also applicable to the discrete-time 
fluid model. 

Let $x(t)$ and $a(t)$ be the number of "on" servers (serving
or idle) and jobs at time $t$ in the data center, respectively.
To keep the problem interesting, we assume that $a(t)$ is not
always zero. Under our workload model, $a(t)$ increases
or decreases by at most one at any time $t$.

To focus on the cost within $[0, T]$, we set $x(0) = a(0)$ and
$x(T) = a(T)$. Note such boundary conditions include the one
considered in the literature, e.g., [20], as a special case, where
$x(0) = a(0) = x(T) = a(T) = 0$.

⁵In practice, a server's service capacity can be determined from the knee of
its throughput and response-time curve [15].

⁶There are also other types of workload, such as the bin-packing model
considered in [15]. Extending the results in this paper to those workload
models is of great interest and left for future work.

⁷Other than the obvious reason that the service capacity can only fit one
job, there could also be an SLA in cloud computing that requires that the job does
not share the physical server with other jobs due to security concerns.
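Let $P_{on}(t_1, t_2)$ and $P_{off}(t_1, t_2)$ denote the total wear-and-
tear cost incurred by turning on and off servers in $[t_1, t_2]$,
respectively, with $\beta_{on}$ and $\beta_{off}$ the costs of turning one
server on and off:

$$P_{on}(t_1, t_2) = \lim_{\delta \to 0} \sum_{i=1}^{\lceil (t_2 - t_1)/\delta \rceil} \beta_{on} \left[ x(t_1 + i\delta) - x(t_1 + (i-1)\delta) \right]^+ \tag{1}$$

and

$$P_{off}(t_1, t_2) = \lim_{\delta \to 0} \sum_{i=1}^{\lceil (t_2 - t_1)/\delta \rceil} \beta_{off} \left[ x(t_1 + (i-1)\delta) - x(t_1 + i\delta) \right]^+ \tag{2}$$

where $[y]^+ = \max\{y, 0\}$.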



B. Problem Formulation 

We formulate the problem of minimizing server operation
cost in a data center in $[0, T]$ as follows:

\begin{align}
\textbf{SCP}: \quad \min \quad & P \int_0^T x(t)\,dt + P_{on}(0, T) + P_{off}(0, T) \tag{3} \\
\text{s.t.} \quad & x(t) \ge a(t), \ \forall t \in [0, T], \tag{4} \\
& x(0) = a(0), \ x(T) = a(T), \tag{5} \\
\text{var} \quad & x(t) \in \mathbb{Z}^+, \ t \in [0, T], \tag{6}
\end{align}

where $\mathbb{Z}^+$ denotes the set of non-negative integers.

The objective is to minimize the sum of server energy
consumption and the wear-and-tear cost. Constraints in (4)
say the service capacity must satisfy the demand. Constraints
in (5) are the boundary conditions.
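
To make the objective in (3) concrete, the following Python sketch evaluates it for a piecewise-constant schedule x(t). It is only an illustration under stated assumptions: the schedule is given as a list of `(start_time, servers_on)` steps, `beta_on` and `beta_off` stand for the per-server turn-on and turn-off costs, and the function name `scp_cost` is hypothetical. The sketch does not check the feasibility constraints (4) and (5).

```python
from typing import List, Tuple


def scp_cost(steps: List[Tuple[float, int]], T: float,
             P: float, beta_on: float, beta_off: float) -> float:
    """Cost of a piecewise-constant schedule x(t) on [0, T].

    steps: (start_time, servers_on) pairs sorted by start_time; the
           first step must start at time 0, and each value holds until
           the next step (or until T for the last one).
    """
    if not steps or steps[0][0] != 0:
        raise ValueError("steps must start at time 0")
    cost = 0.0
    for k, (start, x) in enumerate(steps):
        end = steps[k + 1][0] if k + 1 < len(steps) else T
        cost += P * x * (end - start)          # energy term: P * integral of x(t)
        if k + 1 < len(steps):
            jump = steps[k + 1][1] - x
            cost += beta_on * max(jump, 0)     # wear-and-tear of servers turned on
            cost += beta_off * max(-jump, 0)   # wear-and-tear of servers turned off
    return cost


if __name__ == "__main__":
    # Two servers on during [0, 1), one during [1, 4), two during [4, 5].
    print(scp_cost([(0, 2), (1, 1), (4, 2)], T=5, P=1.0, beta_on=1.0, beta_off=1.0))
    # Keeping both servers on for the whole horizon.
    print(scp_cost([(0, 2)], T=5, P=1.0, beta_on=1.0, beta_off=1.0))
```

In this toy instance, switching one server off for three time units costs 9 in total, against 10 for keeping both servers on, because the idle energy saved (3) exceeds the toggling cost (2).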

Remarks: (i) The problem SCP does not consider the 
possible migration cost associated with the continuous-time 
discrete-load model. Fortunately, our results later show that we 
can schedule servers according to the optimal solution, and at 
the same time dispatch jobs to servers in a way that aligns with 
their on-off schedules, thus incurring no migration cost. Hence, 
the minimum server operation cost remains unaltered even if we
consider migration cost in the problem SCP (which can be
rather complicated to model). (ii) The formulation remains
the same with the discrete-time fluid workload model where there
is no job migration cost to consider. (iii) The problem SCP
is similar to a common one considered in the literature, e.g.,
in [20], with a specific cost function. The difference is that
we allow more flexible boundary conditions and on/off wear- 
and-tear cost modeling, and are more precise in the decision 
variables being integers instead of real numbers. (iv) In the 
problem setting, we assume that the power consumption of a 
server is constant P. Actually, the results of this paper also 
apply to the following unit time power consumption model: 
the power consumption of $x$ busy servers is $F(x)$ and the
unit time power consumption for an idle server is $P$. This
is because the total power consumption under this model is
$\int_0^T \{F[a(t)] + P[x(t) - a(t)]\}\,dt + P_{on}(0, T) + P_{off}(0, T)$. Since
$\int_0^T \{F[a(t)] - P a(t)\}\,dt$ is constant for given $a(t)$, to minimize the
total power consumption is to minimize the objective of the above SCP problem.

There are an infinite number of integer variables $x(t)$, $t \in
[0, T]$, in the problem SCP, which makes it challenging to
solve. Moreover, in practice the data center has to solve the
problem without knowing the workload $a(t)$, $t \in [0, T]$, ahead
of time.



Next, we first focus on designing an off-line solution, including
(i) a job-dispatching algorithm and (ii) a server on-off
scheduling algorithm, to solve the problem SCP optimally.
We then extend the solution to its online versions and analyze
their performance guarantees with or without (partial) future
workload information.

III. Optimal Solution and Offline Algorithm 

We study the off-line version of the server cost minimization
problem SCP, where the workload $a(t)$ in $[0, T]$ is given.

We first identify an elegant structure of its optimal solution, 
which allows us to solve the problem in a "divide-and- 
conquer" manner. That is, to solve the problem SCP in [0, T], 
it suffices to split it into smaller problems over certain critical 
segments and solve them independently. We then derive a 
simple and decentralized algorithm, upon which we build our 
online algorithms. 

A. Critical Times and Critical Segments 

Given $a(t)$ in $[0, T]$, we identify a set of critical times $\{T_i^c\}_i$
and construct the critical segments as follows.

Critical Segment Construction Procedure:

First, traversing $a(t)$, we identify all the job
arrival/departure epochs in $[0, T]$. The first critical time is
$T_1^c = 0$. $T_1^c$ can be a job-arrival epoch or job-departure epoch,
or no job departs/arrives at $T_1^c$. If no job departs or
arrives at $T_1^c$, $T_1^c$ is considered as a job-arrival epoch. Next we
find $T_{i+1}^c$ inductively, given that $T_i^c$ is known.

• If $T_i^c$ is a job-arrival epoch, e.g., the first critical time,
then $T_{i+1}^c$ is the first job-departure epoch after $T_i^c$
(see Fig. 1 for an example).
• If $T_i^c$ is a job-departure epoch, we first try to find the first
arrival epoch $\tau$ after $T_i^c$ so that $a(\tau) = a(T_i^c)$. If such $\tau$
exists, then we set $T_{i+1}^c = \tau$. If no such $\tau$ exists, we set
$T_{i+1}^c$ to be the next job-departure epoch. Examples of both
cases are shown in Fig. 1.

Upon reaching time epoch $T$, we find all, say $M$, critical times.
We define the critical segments as the period between two
consecutive critical times, i.e., $[T_i^c, T_{i+1}^c]$, $1 \le i \le M - 1$.
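
The Critical Segment Construction Procedure can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. It assumes the brick-model workload is given as an initial level `a0` plus a time-sorted list of `(time, +1)` arrivals and `(time, -1)` departures, that a(t) at a departure epoch refers to the level just before the departure, and that T is appended as the last critical time when the scan ends before it. The function and variable names are hypothetical.

```python
from typing import List, Optional, Tuple


def critical_times(a0: int, events: List[Tuple[float, int]], T: float) -> List[float]:
    """Return the critical times of a brick-model workload on [0, T]."""
    # Workload level right after each event.
    after, level = [], a0
    for _, step in events:
        level += step
        after.append(level)

    def next_departure(start: int) -> Optional[int]:
        for k in range(start, len(events)):
            if events[k][1] == -1:
                return k
        return None

    crit = [0.0]
    # idx: index of the event at the current critical time (-1 if none).
    if events and events[0][0] == 0:
        idx, is_departure = 0, events[0][1] == -1
    else:
        idx, is_departure = -1, False   # treated as a job-arrival epoch

    while True:
        nxt, nxt_is_departure = None, True
        if is_departure:
            ref = after[idx] + 1        # level just before this departure
            for k in range(idx + 1, len(events)):
                if events[k][1] == 1 and after[k] == ref:
                    nxt, nxt_is_departure = k, False
                    break
        if nxt is None:                 # arrival epoch, or no matching arrival
            nxt = next_departure(idx + 1)
        if nxt is None:
            break
        crit.append(events[nxt][0])
        idx, is_departure = nxt, nxt_is_departure

    if crit[-1] < T:                    # assumption: T closes the last segment
        crit.append(T)
    return crit


if __name__ == "__main__":
    # Two jobs at time 0; both leave (t = 1, 2) and two jobs arrive later (t = 3, 4).
    print(critical_times(2, [(1, -1), (2, -1), (3, 1), (4, 1)], T=5))
```

For this toy workload the sketch returns the critical times 0, 1, 4, 5: the departure at time 1 is matched with the arrival at time 4 that restores the workload to 2, so the interval between them forms a single critical segment.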

The critical segments have interesting properties. For ex- 
ample, they are disjoint except at the boundary points, and 
they together fully cover the time interval [0, T]. Moreover, 
we observe that workload expresses interesting properties in 
these critical segments. 

Proposition 1. The workload $a(t)$ in any critical segment
$[T_i^c, T_{i+1}^c]$ must be one of the following four types:

• Type-I: workload is non-decreasing in $[T_i^c, T_{i+1}^c]$.
• Type-II: workload is step-decreasing in $[T_i^c, T_{i+1}^c]$. That is,
$a(t) = a(T_i^c) - 1$, $\forall t \in (T_i^c, T_{i+1}^c)$, and $a(t) \le a(T_i^c) - 1$, $\forall t \in (T_i^c, T]$.
• Type-III: workload is of "U-shape" in $[T_i^c, T_{i+1}^c]$. That is,
$a(T_{i+1}^c) = a(T_i^c)$ and $a(t) = a(T_i^c) - 1$, $\forall t \in (T_i^c, T_{i+1}^c)$.
• Type-IV: workload is of "canyon-shape" in $[T_i^c, T_{i+1}^c]$.
That is, $a(T_{i+1}^c) = a(T_i^c)$, $a(t) \le a(T_i^c) - 1$ and not always
identical, $\forall t \in (T_i^c, T_{i+1}^c)$.

Proof: Refer to Appendix A. ■
Examples of these four types of $a(t)$ are shown in Fig. 1.

Figure 1: Illustration of critical times and critical segments. $T_1^c$
to $T_7^c$ are critical times, with $T_1^c = 0$ and $T_7^c = T$, and they form
six critical segments. $a(t)$ is of Type-I in $[T_1^c, T_2^c]$ and Type-II in
$[T_2^c, T_3^c]$; the figure also contains a Type-III and a Type-IV segment.

B. Structure of Optimal Solution 

Let $x^*(t)$, $t \in [0, T]$, be an optimal solution to the problem
SCP, and the corresponding minimum server operation cost
be $P^*$. We have the following observation.

Lemma 2. $x^*(t)$ must meet $a(t)$ at every critical time, i.e.,
$x^*(T_i^c) = a(T_i^c)$, $1 \le i \le M$.

Proof: Refer to Appendix B. ■
Lemma 2 not only presents a necessary condition for a
solution $x(t)$ to be optimal, but also suggests a "divide-and-
conquer" way to solve the problem SCP optimally.

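Consider the following sub-problem of minimizing server
operation cost in a critical segment $[T_i^c, T_{i+1}^c]$, $1 \le i \le M - 1$:

\begin{align}
\min \quad & P \int_{T_i^c}^{T_{i+1}^c} x(t)\,dt + P_{on}(T_i^c, T_{i+1}^c) + P_{off}(T_i^c, T_{i+1}^c) \tag{7} \\
\text{s.t.} \quad & x(t) \ge a(t), \ \forall t \in [T_i^c, T_{i+1}^c], \tag{8} \\
& x(T_i^c) = a(T_i^c), \ x(T_{i+1}^c) = a(T_{i+1}^c), \tag{9} \\
\text{var} \quad & x(t) \in \mathbb{Z}^+, \ t \in [T_i^c, T_{i+1}^c]. \tag{10}
\end{align}

Let its optimal value be $P_i^*$, $1 \le i \le M - 1$. We have the
following observation.

Lemma 3. $\sum_{i=1}^{M-1} P_i^*$ is a lower bound of the optimal server
operation cost of the problem SCP, i.e.,

$$P^* \ge \sum_{i=1}^{M-1} P_i^*. \tag{11}$$

Proof: Refer to Appendix C. ■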
Remark: Over arbitrarily chopped segments, the sum of their
minimum server operation costs may not be a bound for $P^*$.
However, as we will see later, computed based on critical
segments, Eqn. (11) establishes a lower bound of $P^*$ and is
achievable, thanks to the structure of $x^*(t)$ outlined in Lemma 2.

As suggested by Lemma 3, it suffices to solve individual
sub-problems for all critical segments in $[0, T]$, and combine
the corresponding solutions to form an optimal solution
to the overall problem SCP (note the optimal solutions
of sub-problems connect seamlessly). The special
structures of $a(t)$ in individual critical segments, summarized
in Proposition 1, are the key to tackling each sub-problem.



Optimal Solution Construction Procedure:

We visit all the critical segments in $[0, T]$ sequentially, and
construct an $x(t)$, $t \in [0, T]$. For a critical segment $[T_i^c, T_{i+1}^c]$,
$1 \le i \le M - 1$, we check the $a(t)$ in it:

1) the $a(t)$ is of Type-I or Type-II: we simply set $x(t) = a(t)$,
for all $t \in [T_i^c, T_{i+1}^c]$.

2) the $a(t)$ is of Type-III:

• if $\beta_{on} + \beta_{off} \ge P \cdot (T_{i+1}^c - T_i^c)$, then we set $x(t) =
a(T_i^c)$, $\forall t \in [T_i^c, T_{i+1}^c]$;

• otherwise, we set $x(T_i^c) = a(T_i^c)$, $x(T_{i+1}^c) = a(T_{i+1}^c)$,
and $x(t) = a(T_i^c) - 1$, $\forall t \in (T_i^c, T_{i+1}^c)$.

3) the $a(t)$ is of Type-IV:

• if $\beta_{on} + \beta_{off} \ge P \cdot (T_{i+1}^c - T_i^c)$, then we set $x(t) =
a(T_i^c)$, $\forall t \in [T_i^c, T_{i+1}^c]$;

• otherwise, we construct $x(t)$ as follows. In a Type-IV
critical segment, each job-departure epoch $\tau$ in
$[T_i^c, T_{i+1}^c]$ has a corresponding job-arrival epoch $\tau'$
in $[T_i^c, T_{i+1}^c]$ such that $a(\tau) = a(\tau')$ and $a(t) <
a(\tau)$, $\forall t \in (\tau, \tau')$. Find the first job-departure
epoch $\tau_1$ after $T_i^c$ in $[T_i^c, T_{i+1}^c]$ that has a corresponding
job-arrival epoch $\tau_1'$ such that $\beta_{on} + \beta_{off} \ge P \cdot
(\tau_1' - \tau_1)$. Then find the first job-departure epoch
$\tau_2$ after $\tau_1'$ that has a corresponding job-arrival
epoch $\tau_2'$ such that $\beta_{on} + \beta_{off} \ge P \cdot (\tau_2' - \tau_2)$. Continue
in this way until we reach $T_{i+1}^c$. Upon reaching time
epoch $T_{i+1}^c$, we have found all, say $L$, such job-departure
and arrival epoch pairs $(\tau_1, \tau_1'), (\tau_2, \tau_2'), \ldots, (\tau_L, \tau_L')$. If
$L = 0$, which means there does not exist such a job-
departure and arrival epoch pair, we set $x(t) =
a(t)$, $\forall t \in [T_i^c, T_{i+1}^c]$; otherwise, we set $x(t) =
a(t)$, $\forall t \in [T_i^c, \tau_1) \cup (\tau_1', \tau_2) \cup \ldots \cup (\tau_L', T_{i+1}^c]$ and
$x(t) = a(\tau_j)$, $\forall t \in [\tau_j, \tau_j']$ for $j = 1, 2, \ldots, L$.

The following theorem shows that the lower bound of $P^*$
in (11) is achieved by using the above procedure.

Theorem 4. The Optimal Solution Construction Procedure
terminates in finite time, and the resulting $x(t)$, $t \in [0, T]$, is
an optimal solution to the problem SCP.

Proof: Refer to Appendix D. ■

The proof utilizes proof-by-contradiction and counting ar- 
guments. 
Figure 2: An example of a critical segment $[0, T]$ (after
offsetting the time origin to the beginning of the segment)
with Type-IV $a(t)$. This critical segment is further decomposed
into smaller critical segments $[T_1^c, T_2^c]$, $[T_2^c, T_3^c]$, and $[T_3^c, T_4^c]$.
Intervals $\delta_1 = T_3^c - T_2^c$, $\delta_2 = T_3^c - T_1^c$, and $\delta_3 = T_4^c - T_2^c$.



C. Intuitions and Observations 

Constructing optimal $x(t)$ for critical segments with Type-
I/II/III workload is rather straightforward. In the following, we
go through the construction of $x(t)$ for the critical segment with
Type-IV workload shown in Fig. 2 to bring out the intuition.
We define

$$\Delta \triangleq \frac{\beta_{on} + \beta_{off}}{P}$$

as the critical interval over which the energy cost of
maintaining an idle server matches the cost of turning it off at the
beginning of the interval and turning it on at the end of the
interval.

During the critical segment $[0, T]$ with Type-IV workload
shown in Fig. 2, the system starts and ends with 2 jobs and 2
running servers. Let the servers with their jobs leaving at time
0 and $T_2^c$ be S1 and S2, respectively.

At time 0, a job leaves. The procedure compares $\Delta$ and
$T$. If $\Delta > T$, then it sets $x(t) = 2$ and keeps both servers
running for all $t \in [0, T]$; otherwise, it further applies the
Critical Segment Construction Procedure and decomposes
the critical segment into three small ones $[T_1^c, T_2^c]$, $[T_2^c, T_3^c]$, and
$[T_3^c, T_4^c]$, as shown in Fig. 2. The first small critical segment
$[T_1^c, T_2^c]$ has a Type-II workload, thus the procedure sets $x(t) =
1$ for $t \in [T_1^c, T_2^c]$. The second small segment $[T_2^c, T_3^c]$ has a
Type-III workload; thus for all $t \in [T_2^c, T_3^c]$, the procedure
maintains $x(t) = 1$ if $\Delta > \delta_1$ and sets $x(t) = 0$ otherwise. The
last small segment $[T_3^c, T_4^c]$ has a Type-I workload, thus the
procedure sets $x(t) = 1$ for $t \in [T_3^c, T_4^c)$ and $x(T_4^c) = 2$.

These actions reveal two important observations, upon
which we build a decentralized off-line algorithm to solve the
problem SCP optimally.

• Newly arrived jobs should be assigned to servers in the
reverse order of their last-empty-epochs.

In the example, when a new job arrives at time $T_3^c$, the
procedure implicitly assigns it to server S2 instead of S1.
As a result, S1 and S2 have empty periods of $T$ and $\delta_1$,
respectively. This may sound counter-intuitive as compared to
an alternative "fair" strategy that assigns the job to the early-
emptied server S1, which gives S1 and S2 empty periods of $\delta_2$
and $\delta_3$, respectively. Different job-dispatching gives a different
empty-period distribution. It turns out a more skewed empty-
period distribution leads to more energy saving.

The intuition is that job-dispatching should try to make
every server empty as long as possible so that the on-off
option, if explored, can save abundant energy.

• Upon being assigned an empty period, a server only needs
to independently make a locally energy-optimal decision.

It is straightforward to verify that in the example, upon a job
leaving server S1 at time 0, the procedure implicitly assigns
an empty period of $T$ to S1, and turns S1 off if $\Delta < T$ and
keeps it running at idle state otherwise. Similarly, upon a job
leaving S2 at time $T_2^c$, S2 is turned off if $\Delta < \delta_1$ and stays
idle otherwise. Such comparisons and decisions can be done
by individual servers themselves.

D. Offline Algorithm Achieving the Optimal Solution 

The Optimal Solution Construction Procedure determines
how many running servers to maintain at time $t$, i.e.,
$x^*(t)$, to achieve the optimal server operation cost $P^*$. However,
as discussed in Section II-A, under the continuous-time brick
model, scheduling servers on/off according to $x^*(t)$ might incur
non-trivial job migration cost.

Exploiting the two observations made in the case-study at
the end of last subsection, we design a simple and decentralized
off-line algorithm that gives an optimal $x^*(t)$ and incurs
no job migration cost.

Decentralized Off-line Algorithm A0:
By a central job-dispatching entity: it implements a last-
empty-server-first strategy. In particular, it maintains a stack
(i.e., a Last-In/First-Out queue) storing the IDs for all idle or
off servers. Before time 0, the stack contains IDs for all the
servers that are not serving.

• Upon a job arrival: the entity pops a server ID from the
top of the stack, and assigns the job to the corresponding
server (if the server is off, the entity turns it on).
• Upon a job departure: a server just turns idle, the entity
pushes the server ID into the stack.

By each server:

• Upon receiving a job: the server starts serving the job
immediately.

• Upon a job leaving this server and it becomes empty: let
the current time be $t_1$. The server searches for the earliest
time $t_2 \in (t_1, t_1 + \Delta]$ so that $a(t_2) = a(t_1)$. If no such $t_2$
exists, then the server turns itself off. Otherwise, it stays
idle.

We remark that in the algorithm, we use the same server to 
serve a job during its entire sojourn time. Thus there is no job 
migration cost. The following theorem justifies the optimality 
of the off-line algorithm. 

Theorem 5. The proposed off-line algorithm A0 achieves the
optimal server operation cost of the problem SCP.

Proof: Refer to Appendix E. ■
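
The cost incurred by the off-line algorithm can be sketched in Python as follows. This is an illustrative reading of the algorithm under stated assumptions, not the authors' code: the workload is an initial level `a0` plus a time-sorted list of `(time, +1)` arrivals and `(time, -1)` departures; `beta_on` and `beta_off` denote the per-server turn-on and turn-off costs; the critical interval is taken as `(beta_on + beta_off) / P`; and servers still empty at time T are assumed to have been turned off. Because the dispatcher is last-in/first-out, it is enough to keep a stack of the times at which servers became empty.

```python
from typing import List, Tuple


def offline_a0_cost(a0: int, events: List[Tuple[float, int]], T: float,
                    P: float, beta_on: float, beta_off: float) -> float:
    """Total cost on [0, T] of last-empty-server-first dispatching combined
    with the off-line off-or-idle rule (idle iff the empty period <= delta)."""
    delta = (beta_on + beta_off) / P    # critical interval (assumed definition)
    empty_since = []                    # LIFO stack of times servers became empty
    cost, level, last_t = 0.0, a0, 0.0
    for t, step in events:
        cost += P * level * (t - last_t)        # energy of the busy servers
        last_t = t
        if step == -1:                          # departure: server becomes empty
            empty_since.append(t)
            level -= 1
        else:                                   # arrival: reuse last-emptied server
            level += 1
            if empty_since:
                empty_len = t - empty_since.pop()
                if empty_len <= delta:
                    cost += P * empty_len           # stayed idle
                else:
                    cost += beta_on + beta_off      # turned off, then on again
            else:
                cost += beta_on                 # server that was off since time 0
    cost += P * level * (T - last_t)
    cost += beta_off * len(empty_since)         # never reused: turned off once
    return cost


if __name__ == "__main__":
    events = [(1, -1), (2, -1), (3, 1), (4, 1)]
    print(offline_a0_cost(2, events, T=5, P=1.0, beta_on=1.0, beta_off=1.0))
```

In this toy instance the two empty periods have lengths 3 and 1 with a critical interval of 2, so the first server is switched off and the second stays idle, for a total cost of 9.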
There are two important observations. First, the job-
dispatching strategy only depends on the past job arrivals and
departures. Consequently, the strategy assigns a job to the
same server regardless of whether it knows future job arrivals/departures or
not; it also acts independently of servers' off-or-idle decisions.
Second, each individual server is actually solving a classic ski-
rental problem [24]: whether to "rent", i.e., keep idle, or to
"buy", i.e., turn off now and on later, but with their "days-
of-skiing" (corresponding to servers' empty periods) jointly
determined by the job-dispatching strategy.

Next, we exploit these two observations to extend the
off-line algorithm A0 to its online versions with performance
guarantees.

IV. Online Dynamic Provisioning with or without Future 
Workload Information 

Inspired by our off-line algorithm, we construct online 
algorithms by combining (i) the same last-empty-server-first 
job-dispatching strategy as the one in algorithm A0, and (ii)
an off-or-idle decision module running on each server to solve 
an online ski-rental problem. 

As discussed at the end of last section, the last-empty- 
server-first job-dispatching strategy utilizes only past job ar- 
rival/departure information. Consequently, as compared to the 
offline case, in the online case it assigns the same set of jobs to 
the same server at the same sequence of epochs. The following 
lemma rigorously confirms this observation. 

Lemma 6. For the same $a(t)$, $t \in [0, T]$, under the last-empty-
server-first job-dispatching strategy, each server will get the
same job at the same time and the job will leave the server
at the same time for both off-line and online situations.

Proof: Refer to Appendix F. ■
As a result, in the online case, each server still faces the
same set of off-or-idle problems as compared to the off-line 
case. This is the key to derive the competitive ratios of our 
to-be-presented online algorithms. 

Each server, not knowing the empty periods ahead of time, 
however, needs to decide whether to stay idle or be off (and 
if so when) in an online fashion. One natural approach is to 
adopt classic algorithms for the online ski-rental problem. 

A. Dynamic Provisioning without Future Workload Information

For the online ski-rental problem, the break-even algorithm
in [24] and the randomized algorithm in [25] have
competitive ratios 2 and $e/(e - 1)$, respectively. The ratios have
been proved to be optimal for deterministic and randomized
algorithms, respectively. Directly adopting these algorithms
in the off-or-idle decision module leads to two online
solutions for the problem SCP with competitive ratios 2 and
$e/(e - 1) \approx 1.58$. These ratios improve the best known ratio 3
achieved by the algorithm in [20].

The resulting solutions are decentralized and easy to
implement: a central entity runs the last-empty-server-first job-
dispatching strategy, and each server independently runs an
online ski-rental algorithm. For example, if the break-even
algorithm is used, a server that just becomes empty at time
$t$ will stay idle for $\Delta$ amount of time. If it receives no job
during this period, it turns itself off. Otherwise, it starts to serve
the job immediately. As a special case covered by Theorem
7, it turns out this directly gives a 2-competitive dynamic
provisioning solution.
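
For intuition only, the per-server factor of 2 can be checked on a single empty period of length $d$. The derivation below is a sketch, not a substitute for the formal proof referenced above. It assumes $\beta_{on}$ and $\beta_{off}$ denote the per-server turn-on and turn-off costs, that the critical interval satisfies $P\Delta = \beta_{on} + \beta_{off}$, and that the server is needed again at the end of the empty period; the symbols $C_{\mathrm{off}}$ and $C_{\mathrm{BE}}$ are introduced here for the off-line and break-even costs.

```latex
% d: length of one empty period, with P\Delta = \beta_{on} + \beta_{off}
\begin{align*}
C_{\mathrm{off}}(d) &= \min\{P d,\ \beta_{on} + \beta_{off}\} = P \min\{d, \Delta\}, \\
C_{\mathrm{BE}}(d) &=
  \begin{cases}
    P d, & d \le \Delta, \\
    P\Delta + \beta_{on} + \beta_{off} = 2P\Delta, & d > \Delta,
  \end{cases} \\
\frac{C_{\mathrm{BE}}(d)}{C_{\mathrm{off}}(d)} &\le 2 \quad \text{for all } d > 0 .
\end{align*}
```

The ratio equals 1 when $d \le \Delta$ and exactly 2 when $d > \Delta$, so the worst case is an empty period that ends just after the server has turned itself off.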

B. Dynamic Provisioning with Future Workload Information 

Classic online problem studies usually assume zero future
information. However, in our data center dynamic provisioning
problem, one key observation many existing solutions
exploit is that the workload exhibits highly regular patterns.
Thus the workload information in a near prediction window
may be accurately estimated by machine learning or model
fitting based on historical data [14], [26]. Can we exploit such
future knowledge, if available, in designing online algorithms?
If so, how much gain can we get?

We use an example to explain why, and by how much,
future knowledge can help. Suppose that at any time $t$, the
workload information in a prediction window $[t, t + \alpha\Delta]$
is available, where $\alpha \in [0, 1]$ is a constant. Consider a server
running the break-even algorithm that has just become empty at time
$t_1$, and whose empty period happens to be just a bit longer than
$\Delta$.

Following the standard break-even algorithm, the server
waits for $\Delta$ amount of time before turning itself off. According
to the setting, it receives a job right after the epoch $t_1 + \Delta$, and it
has to power up to serve the job. This incurs a total cost of
$2P\Delta$, as compared to the optimal cost $P\Delta$, which is achieved
by the server staying idle all the way.

An alternative strategy that costs less is as follows. The
server stays idle for $(1 - \alpha)\Delta$ amount of time, and peeks into
the prediction window $[t_1 + (1 - \alpha)\Delta, t_1 + \Delta]$. Due to the
last-empty-server-first job-dispatching strategy, the server can easily
tell that it will receive a job if any $a(t)$ in the window exceeds
$a(t_1)$, and no job otherwise. According to the setting, the server
sees itself receiving no job during $[t_1 + (1 - \alpha)\Delta, t_1 + \Delta]$ and it
turns itself off at time $t_1 + (1 - \alpha)\Delta$. Later it turns itself on to
serve the job right after $t_1 + \Delta$. Under this strategy, the overall
cost is $(2 - \alpha)P\Delta$, which is better than that of the break-even
algorithm.
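
The costs in this example can be written out term by term. The following is a worked sketch, assuming that one off/on cycle costs $\beta_{on} + \beta_{off} = P\Delta$ (the definition of the critical interval) and that the empty period has length $\Delta + \epsilon$, where the small $\epsilon > 0$ is a symbol introduced here only for illustration:

```latex
\begin{align*}
C_{\text{break-even}} &= \underbrace{P\Delta}_{\text{idle for } \Delta}
                       + \underbrace{P\Delta}_{\text{one off/on cycle}} = 2P\Delta, \\
C_{\text{future-aware}} &= \underbrace{P(1-\alpha)\Delta}_{\text{idle for } (1-\alpha)\Delta}
                       + \underbrace{P\Delta}_{\text{one off/on cycle}} = (2-\alpha)P\Delta, \\
C_{\text{stay idle}} &= P(\Delta + \epsilon) \to P\Delta \quad \text{as } \epsilon \to 0 .
\end{align*}
```

The ratio of the second cost to the third approaches $2 - \alpha$, which is the deterministic competitive ratio stated later for Algorithm A1.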

This simple example shows that it is possible to modify classic
online algorithms to exploit future workload information and
obtain better performance. To this end, we propose new
future-aware online ski-rental algorithms and build new online
solutions upon them.

We model the availability of future workload information
as follows. For any $t$, the workload $a(\tau)$ for $\tau$ in the window
$[t, t + \alpha\Delta]$ is known, where $\alpha \in [0, 1]$ is a constant and $\alpha\Delta$
represents the size of the window.

We present both the modified break-even algorithm and
the resulting decentralized and deterministic online solution
as follows. The modified future-aware break-even algorithm
is very simple and is summarized in the server's
actions upon job departure.



Future-Aware Online Algorithm A1:

By a central job-dispatching entity: it implements the
last-empty-server-first job-dispatching strategy, i.e., the one
described in the off-line algorithm.

By each server:

• Upon receiving a job: the server starts serving the job
immediately.

• Upon a job leaving this server and the server becoming empty: the
server waits for $(1 - \alpha)\Delta$ amount of time,

- if it receives a job during the period, it starts serving
the job immediately;

- otherwise, it looks into the prediction window of size
$\alpha\Delta$. It turns itself off if it will receive no job during
the window. Otherwise, it stays idle.
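
The per-server rule of Algorithm A1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `a1_cost` and `offline_cost` are hypothetical, a single empty period of length `gap` is considered in isolation, and one off/on cycle is assumed to cost `power * delta`.

```python
def a1_cost(gap, delta, alpha, power=1.0):
    """Cost of one empty period of length `gap` under the A1 server rule.

    Assumption: one off/on cycle costs power * delta, i.e. delta is the
    critical interval (beta_on + beta_off) / P.
    """
    wait = (1.0 - alpha) * delta          # idle time before peeking
    if gap <= wait:
        return power * gap                # a job arrived while waiting
    # Peek into the window [wait, wait + alpha * delta] = [wait, delta].
    if gap <= delta:
        return power * gap                # a job is coming: stay idle
    # No job in the window: turn off now and pay one off/on cycle later.
    return power * wait + power * delta


def offline_cost(gap, delta, power=1.0):
    """Optimal cost with full knowledge: stay idle, or cycle off/on once."""
    return power * min(gap, delta)


if __name__ == "__main__":
    delta, alpha = 6.0, 0.5
    for gap in (1.0, 5.0, 6.0, 6.01, 20.0):
        ratio = a1_cost(gap, delta, alpha) / offline_cost(gap, delta)
        print(f"gap={gap:5.2f}  ratio={ratio:.3f}")
```

In this sketch the ratio between the two costs is largest when `gap` is just above `delta`, where it equals `2 - alpha`.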

In fact, as shown in Theorem 7 later in this section, the
algorithm A1 has the best possible competitive ratio for
any deterministic algorithm under the last-empty-server-first
job-dispatching strategy. Thus, unless we change the
job-dispatching strategy, no deterministic algorithm can achieve
a better competitive ratio than the algorithm A1.

Similarly, we present both the modified randomized
algorithms for solving the online ski-rental problem and the resulting
decentralized and randomized online solutions as follows. The
modified future-aware randomized algorithms are also
summarized in the server's actions upon job departure.
The first randomized algorithm A2 is a direct extension of
the one in [25] to make it future-aware. The algorithm A3
is new and it has the best possible competitive ratio for any
randomized algorithm under the last-empty-server-first
job-dispatching strategy.

Future-Aware Online Algorithm A2:

By a central job-dispatching entity: it implements the
last-empty-server-first job-dispatching strategy, i.e., the one
described in the off-line algorithm.

By each server:

• Upon receiving a job: the server starts serving the job
immediately.

• Upon a job leaving this server and the server becoming empty: the
server waits for $Z$ amount of time, where $Z$ is generated
according to the following probability density function

\[
f_Z(z) =
\begin{cases}
\dfrac{e^{z/((1-\alpha)\Delta)}}{(e-1)(1-\alpha)\Delta}, & \text{if } 0 \le z \le (1-\alpha)\Delta; \\
0, & \text{otherwise.}
\end{cases}
\]

- if it receives a job during the period, it starts serving
the job immediately;

- otherwise, it looks into the prediction window of size
$\alpha\Delta$. It turns itself off if it will receive no job during
the window. Otherwise, it stays idle.
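
One way to draw the waiting time $Z$ of Algorithm A2 is inverse-CDF sampling. The sketch below is an illustration, not the authors' implementation: the function name `sample_a2_wait` and the optional argument `u` are hypothetical, and the density is taken to be $e^{z/c}/((e-1)c)$ on $[0, c]$ with $c = (1-\alpha)\Delta$.

```python
import math
import random


def sample_a2_wait(delta, alpha, u=None):
    """Draw the waiting time Z of Algorithm A2 by inverse-CDF sampling.

    The density e^{z/c} / ((e - 1) c) on [0, c], with c = (1 - alpha) * delta,
    has CDF F(z) = (e^{z/c} - 1) / (e - 1), so Z = c * ln(1 + u (e - 1)).
    `u` is a uniform(0, 1) draw; it can be passed in for reproducibility.
    """
    c = (1.0 - alpha) * delta
    if c == 0.0:                 # alpha = 1: the server peeks immediately
        return 0.0
    if u is None:
        u = random.random()
    return c * math.log(1.0 + u * (math.e - 1.0))
```

Under this density the mean waiting time is $(1-\alpha)\Delta/(e-1)$, so for an empty period longer than $\Delta$ the expected cost is $P\Delta\,(1 + (1-\alpha)/(e-1)) = P\Delta\,(e-\alpha)/(e-1)$.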



Future-Aware Online Algorithm A3:

By a central job-dispatching entity: it implements the
last-empty-server-first job-dispatching strategy, i.e., the one
described in the off-line algorithm.

By each server:

• Upon receiving a job: the server starts serving the job
immediately.

• Upon a job leaving this server and the server becoming empty: the
server waits for $Z$ amount of time, where $Z$ is generated
according to the following probability distribution

\[
f_Z(z) =
\begin{cases}
\dfrac{e^{z/((1-\alpha)\Delta)}}{(e-1+\alpha)(1-\alpha)\Delta}, & \text{if } 0 < z \le (1-\alpha)\Delta; \\
0, & \text{otherwise,}
\end{cases}
\qquad
P(Z = 0) = \frac{\alpha}{e-1+\alpha}.
\]

- if it receives a job during the period, it starts serving
the job immediately;

- otherwise, it looks into the prediction window of size
$\alpha\Delta$. It turns itself off if it will receive no job during
the window. Otherwise, it stays idle.
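
The waiting time of Algorithm A3 mixes a point mass at zero with a continuous part, so sampling needs one extra branch. The sketch below assumes the distribution has an atom $P(Z=0) = \alpha/(e-1+\alpha)$ and density $e^{z/c}/((e-1+\alpha)c)$ on $(0, c]$ with $c = (1-\alpha)\Delta$, which is the form that sums to one and reproduces the ratio $e/(e-1+\alpha)$; the function name `sample_a3_wait` is hypothetical.

```python
import math
import random


def sample_a3_wait(delta, alpha, u=None):
    """Draw the waiting time Z of Algorithm A3 (assumed mixed distribution).

    Atom:    P(Z = 0) = alpha / (e - 1 + alpha).
    Density: e^{z/c} / ((e - 1 + alpha) c) on (0, c], c = (1 - alpha) * delta.
    `u` is a uniform(0, 1) draw; it can be passed in for reproducibility.
    """
    if u is None:
        u = random.random()
    norm = math.e - 1.0 + alpha
    p_zero = alpha / norm
    c = (1.0 - alpha) * delta
    if u <= p_zero or c == 0.0:
        return 0.0
    # Continuous part: F(z) = p_zero + (e^{z/c} - 1) / norm for 0 < z <= c.
    return c * math.log(1.0 + (u - p_zero) * norm)
```

With this distribution the mean waiting time is $(1-\alpha)\Delta/(e-1+\alpha)$, so an empty period longer than $\Delta$ costs $P\Delta\, e/(e-1+\alpha)$ in expectation.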



The three future-aware online algorithms inherit the nice 
properties of the proposed off-line algorithm in the previous 
section. The same server is used to serve a job during its 
entire sojourn time. Thus there is no job migration cost. The 
algorithms are decentralized, making them easy to implement 
and scale. 

Since no such future-aware online algorithms are available
in the literature, we analyze their competitive ratios and
present the results as follows.

Theorem 7. The deterministic online algorithm A1 has a
competitive ratio of $2 - \alpha$. The randomized online algorithm
A2 achieves a competitive ratio of $(e - \alpha)/(e - 1)$. The
randomized online algorithm A3 achieves a competitive ratio of
$e/(e - 1 + \alpha)$. The competitive ratios of the algorithms A1 and
A3 are the best possible for deterministic and randomized
algorithms, respectively, under the last-empty-server-first
job-dispatching strategy.

Proof: Refer to Appendix F. ■

Remarks: (i) When $\alpha = 1$, all three algorithms achieve
the optimal server operation cost. This matches the intuition
that servers only need to look $\Delta$ amount of time ahead to
make optimal off-or-idle decisions upon job departures. This
immediately gives a fundamental insight that future workload
information beyond the critical interval $\Delta$ (corresponding to
$\alpha = 1$) will not improve dynamic provisioning performance.
(ii) The competitive ratios presented in the above theorem are
for the worst case. We have carried out simulations using
real-world traces and found the empirical ratios are much better, as
shown in Fig. 3. (iii) To achieve better competitive ratios, the
theorem says that it is necessary to change the job-dispatching
strategy, since otherwise no deterministic or randomized
algorithm does better than the algorithms A1 and A3, respectively.
(iv) Our analysis assumes the workload information in the prediction
window is accurate. We evaluate the online algorithms in
simulations using real-world traces with prediction errors, and
observe they are fairly robust to the errors. More details are
provided in Section V.
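
To see how the three worst-case ratios of Theorem 7 move with the amount of look-ahead, they can simply be tabulated. The short sketch below does this; the function name `competitive_ratios` is hypothetical.

```python
import math


def competitive_ratios(alpha):
    """Worst-case ratios of A1, A2, A3 from Theorem 7 for alpha in [0, 1]."""
    a1 = 2.0 - alpha
    a2 = (math.e - alpha) / (math.e - 1.0)
    a3 = math.e / (math.e - 1.0 + alpha)
    return a1, a2, a3


if __name__ == "__main__":
    for alpha in (0.0, 0.25, 0.5, 0.75, 1.0):
        a1, a2, a3 = competitive_ratios(alpha)
        print(f"alpha={alpha:.2f}  A1={a1:.3f}  A2={a2:.3f}  A3={a3:.3f}")
```

At $\alpha = 0$ the two randomized ratios coincide at $e/(e-1)$, and at $\alpha = 1$ all three ratios equal 1, consistent with Remark (i).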

C. Adapting the Algorithms to Work with Discrete-Time Fluid 
Workload Model 

Adapting our off-line and online algorithms to work with
the discrete-time fluid workload model involves two simple
modifications. Recall that in the discrete-time fluid model, time is
divided into equal-length slots. Jobs arriving in one slot are
served in the same slot. Workload can be split among running
servers at arbitrary granularity, like a fluid.

Figure 3: Comparison of the worst-case competitive ratios
(according to Theorem 7) and the empirical competitive ratios
observed in simulations using real-world traces. The critical
window size is $\Delta = 6$ units of time. More simulation details are
in Section V.

For the job-dispatching entity in all the algorithms, at the
end of each slot, when all servers are considered to be empty,
it pushes all the server IDs back into the stack (the order does not
matter). Then, at the beginning of each slot, it pops just enough
server IDs from the stack in a Last-In/First-Out manner to
satisfy the current workload. In this way, the job-dispatching
entity essentially packs the workload onto as few servers as
possible, following the last-empty-server-first strategy.
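
The stack-based dispatching described above can be sketched as follows. This is an illustration under simplifying assumptions: the function name `dispatch_slots` is hypothetical, and the workload of each slot is assumed to have already been converted into an integer number of servers needed.

```python
def dispatch_slots(loads, num_servers):
    """Return, for each slot, the list of server IDs that serve in it.

    `loads[k]` is the (integer) number of servers needed in slot k.
    """
    stack = list(range(num_servers))      # top of the stack = end of list
    schedule = []
    for need in loads:
        if need > len(stack):
            raise ValueError("load exceeds the number of servers")
        # Beginning of the slot: pop just enough IDs, last-in/first-out.
        busy = [stack.pop() for _ in range(need)]
        schedule.append(busy)
        # End of the slot: all servers are empty again; push the IDs back.
        # The order is immaterial; reversing keeps the run deterministic.
        stack.extend(reversed(busy))
    return schedule
```

For example, `dispatch_slots([2, 1, 3], 4)` keeps reusing the servers at the top of the stack, so the servers near the bottom see long empty periods and are the ones that can be turned off.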

Individual servers start to serve upon receiving
jobs, and start to solve the off-line or online ski-rental
problem once all their jobs have left and they become empty.

It is not difficult to verify that the modified algorithms still retain
their corresponding performance guarantees. In fact, we have
the following corollary.

Corollary 8. The modified deterministic and randomized
online algorithms for the discrete-time fluid workload have
competitive ratios of $2 - \alpha$, $(e - \alpha)/(e - 1)$, and $e/(e - 1 + \alpha)$,
respectively.

Proof: Refer to Appendix G. ■

D. Comparison with the DELAYEDOFF Algorithm 

It is somewhat surprising to find that our algorithms share
similar ingredients with the DELAYEDOFF algorithm in [22],
since these are two independent efforts setting out to optimize
different objective functions (total energy consumption in our
study vs. Energy-Response time Product (ERP) in [22]).

The DELAYEDOFF algorithm contains two modules. The
first one is a job-dispatching module that assigns a newly
arrived job to the most-recently-busy idle server (i.e., the idle
server that was most recently busy); servers in the off state are
not included. The second one is a delay-off module running on
each server that keeps the server idle for some pre-determined
amount of time, defined as $t_{wait}$, before turning it off. If the
server gets a job to serve in this period, its idle time is
reset to 0. The authors of [22] show that for any $t_{wait}$, if the
job arrival process is Poisson, the DELAYEDOFF algorithm
minimizes the average ERP of a data center as the load (i.e.,
the ratio between the arrival rate and the average sojourn time)
approaches infinity.



Interestingly, if there are idle servers in the system, DELAYEDOFF
and the algorithm A1 will choose the same server to serve
the new job, because the most-recently-busy server is indeed
the last-empty server in this case. If there are no idle servers,
the algorithm A1 will still choose the last-empty server but
DELAYEDOFF will randomly select an off server to serve
the job. With this observation, the DELAYEDOFF algorithm,
under the setting $t_{wait} = \Delta$, can be viewed as a variant of a
special case of the algorithm A1 with zero future workload
information available (i.e., $\alpha = 0$). It would be interesting
to see whether the analytical insights used in analyzing the
DELAYEDOFF algorithm can be used to understand the
performance of the algorithm A1 when the job arrival process
is Poisson.

Despite the similarity between the algorithm A1 and the
DELAYEDOFF algorithm, it is not clear what the competitive
ratio of DELAYEDOFF is. Unlike our last-empty-server-first
job-dispatching strategy, the most-recently-busy idle server
first strategy does not guarantee that a server faces the same set of
ski-rental problems in the online case as in the off-line
case. Consequently, it is not clear how to relate the online
cost of the DELAYEDOFF algorithm to the off-line optimal
cost.

The two job-dispatching strategies differ more when the
server waiting time is random, e.g., in our algorithms A2 and
A3, where a later-empty server may turn itself off before an
earlier-empty server does; hence, the most-recently-busy (idle)
server is usually not the last-empty server. We compare the
performance of algorithms A1, A2, A3, and DELAYEDOFF
in simulations in Section V.



V. Experiments 

We implement the proposed off-line and online algorithms
and carry out simulations using real-world traces to evaluate
their performance. Our purposes are threefold. First, to
evaluate the performance of the algorithms using real-world traces.
Second, to study the impact of workload prediction error and
workload characteristics on the algorithms' performance. Third,
to compare our algorithms with two recently proposed solutions,
LCP(w) in [20] and DELAYEDOFF in [22].



A. Settings 

Workload trace: The real-world traces we use in the
experiments are a set of I/O traces taken from 6 RAID volumes
at MSR Cambridge [27]. The traced period was one week,
from February 22 to 29, 2007. We estimate the average
number of jobs over disjoint 10-minute intervals. The data
trace has a peak-to-mean ratio (PMR) of 4.63. The jobs
are of the "request-response" type and thus the workload is better
described by a discrete-time fluid model, with the slot length
being 10 minutes and the load in each slot being the average
number of jobs.

As discussed in Section IV-C, the proposed off-line and
online algorithms also work with the discrete-time fluid workload
model after simple modifications. In the experiments, we run
the modified algorithms using the above real-world traces.

Figure 4: Real-world workload trace and performance of the algorithms under different situations.



Cost benchmark: Current data centers usually do not use
dynamic provisioning. The cost incurred by static provisioning
is usually taken as the benchmark for evaluating new algorithms
[20], ifTSll. Static provisioning runs a constant number of
servers to serve the workload. In order to satisfy the
time-varying demand during a period, data centers usually
over-provision and keep more running servers than what is needed
to satisfy the peak load. In our experiment, we assume that
the data center has the complete workload information ahead
of time and provisions exactly to satisfy the peak load. Using
such a benchmark gives us a conservative estimate of the cost
saving from our algorithms.
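
The benchmark and the reported cost reduction can be computed as in the sketch below. It is an illustration under stated assumptions: the function names are hypothetical, `loads` holds the number of servers needed in each slot, and the static benchmark is assumed to incur no on-off cost because its servers stay on throughout.

```python
def static_provisioning_cost(loads, power=1.0, slot_length=1.0):
    """Energy cost of keeping exactly peak-load many servers on throughout."""
    return max(loads) * len(loads) * slot_length * power


def cost_reduction(benchmark_cost, algorithm_cost):
    """Fractional saving of an algorithm relative to the benchmark."""
    return 1.0 - algorithm_cost / benchmark_cost
```

For instance, with per-slot loads `[1, 4, 2, 1]` the benchmark cost is 16 energy units, and an algorithm that spends 8 units achieves a cost reduction of 0.5.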

Server operation cost: The server operation cost is
determined by the unit-time energy cost $P$ and the on-off costs $\beta_{on}$ and
$\beta_{off}$. In the experiment, we assume that a server consumes one
unit of energy per unit time, i.e., $P = 1$. We set $\beta_{off} + \beta_{on} = 6$,
i.e., the cost of turning a server off and on once is equal to
that of running it for six units of time [20]. Under this setting,
the critical interval is $\Delta = (\beta_{off} + \beta_{on})/P = 6$ units of time.

B. Performance of the Proposed Online Algorithms 

We have characterized in Theorem 7 how the competitive ratios
of our proposed online algorithms change as the prediction window
size, i.e., $\alpha\Delta$, increases. The resulting competitive ratios, i.e.,
$2 - \alpha$, $(e - \alpha)/(e - 1)$ and $e/(e - 1 + \alpha)$, already appealing,
are for the worst-case scenarios. In practice, the actual
performance can be even better.

In our first experiment, we study the performance of our
online algorithms using real-world traces. The results are
shown in Fig. 4b. The cost reduction curves are obtained by
comparing the power cost incurred by the off-line algorithm,
the three online algorithms, the LCP(w) algorithm [20] and the
DELAYEDOFF algorithm [22] to the cost benchmark. The
vertical axis indicates the cost reduction and the horizontal
axis indicates the size of the prediction window, varying from
0 to 10 units of time.

As seen, for this workload, our three online
algorithms, LCP(w), and DELAYEDOFF all achieve substantial
cost reduction as compared to the benchmark. In particular,
the cost reductions of our three online algorithms are beyond
66% even when no future workload information is available,
while LCP(w) has to have (or estimate) one unit time of future
workload to execute, and thus it starts to perform when the
prediction window size is one. The cost reductions of our
three online algorithms grow linearly as the prediction window
increases, and reach the optimum when the prediction window
size reaches $\Delta$. These observations match what Theorem 7
predicts. Meanwhile, LCP(w) has not yet reached the optimal
performance when the prediction window size reaches the
critical value $\Delta$. DELAYEDOFF has the same performance
for all prediction window sizes since it does not exploit future
workload information.

As seen in Fig. 4b, in the simulation our three algorithms
achieve the optimal power consumption when the size of the
prediction window is 5, one unit smaller than the theoretically
computed value $\Delta = 6$. At first glance, the results do not seem
aligned with what the analysis suggests. But a careful
investigation reveals that there is no misalignment between
analysis and simulation. Because jobs are assigned to servers
at the beginning of each slot in the discrete-time fluid model,
knowing the workload from the current time to the beginning
of the 5th look-ahead future slot is equivalent to knowing
the workload over a duration of 6 slots. Hence, the analysis
indeed suggests that Algorithms A1-A3 can achieve optimal power
consumption when the size of the prediction window is 5, as
observed in Fig. 4b.

C. Impact of Prediction Error 

The previous experiments show that both our algorithms and
LCP(w) have better performance if accurate future workload
is available. However, there are always prediction errors in
practice. Therefore, it is important to evaluate the performance
of the algorithms in the presence of prediction errors.

To achieve this goal, we evaluate our online algorithms
with prediction window sizes of 2 and 4 units of time.
Zero-mean Gaussian prediction error is added to each unit-time
workload in the prediction window, with its standard deviation
growing from 0 to 50% of the corresponding actual workload.
In practice, prediction error tends to be small [28]; thus we
are essentially stress-testing the algorithms.
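
The error model above can be reproduced with a few lines of Python. This is a minimal sketch, not the simulation code used for the figures: the function name `noisy_prediction` is hypothetical, and clipping negative values to zero is an added assumption.

```python
import random


def noisy_prediction(window, std_fraction, rng=None):
    """Add zero-mean Gaussian error to each unit-time workload in `window`.

    The standard deviation of each error is `std_fraction` times the
    actual workload of that slot. Clipping at zero is an added assumption,
    since a negative workload has no meaning.
    """
    rng = rng or random.Random()
    return [max(0.0, a + rng.gauss(0.0, std_fraction * a)) for a in window]
```

Passing a seeded `random.Random` instance makes repeated runs reproducible, which is convenient when averaging over many runs.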

We average 100 runs for each algorithm and show the results
in Fig. 4c, where the vertical axis represents the cost reduction
as compared to the benchmark.

On one hand, we observe that all algorithms are fairly robust
to prediction errors. On the other hand, all algorithms achieve
better performance with prediction window size 4 than with size
2. This indicates that more future workload information, even if
inaccurate, is still useful in boosting the performance.

D. Impact of Peak-to-Mean Ratio (PMR)

Intuitively, compared with static provisioning, dynamic
provisioning can save more power when the data center trace
has a large PMR. Our experiments confirm this intuition, which
is also observed in other works [20], IfTSl. Similar to [20],
we generate the workload from the MSR traces by scaling
$a(t)$ as $\bar{a}(t) = K a^{\gamma}(t)$, and adjusting $\gamma$ and $K$ to keep the
mean constant. We run the off-line algorithm, the three online
algorithms, LCP(w) and DELAYEDOFF using workloads with
different PMRs ranging from 2 to 10, with a prediction window
size of one unit time. The results are shown in Fig. 4d.
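
The trace scaling can be sketched as follows. This is an illustration only: the function names `scale_trace` and `peak_to_mean` are hypothetical, and the trace is assumed to be a list of non-negative per-slot loads that are not all zero.

```python
def scale_trace(trace, gamma):
    """Return K * a(t)**gamma with K chosen so the mean is unchanged."""
    powered = [a ** gamma for a in trace]
    k = sum(trace) / sum(powered)         # keeps the mean constant
    return [k * p for p in powered]


def peak_to_mean(trace):
    """Peak-to-mean ratio (PMR) of a workload trace."""
    return max(trace) * len(trace) / sum(trace)
```

Choosing `gamma` above 1 stretches the peaks relative to the mean and so raises the PMR, while `gamma` below 1 flattens the trace; a target PMR can be reached by searching over `gamma`.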

As seen, the energy saving increases from about 40% at
PMR = 2, which is common in large data centers, to larger
values for the higher PMRs that are common in small to medium
sized data centers. Similar results are observed for different
prediction window sizes.

VI. Concluding Remarks 

Dynamic provisioning is an effective technique in reducing 
server energy consumption in data centers, by turning off 
unnecessary servers to save energy. In this paper, we design 
online dynamic provisioning algorithms with zero or partial 
future workload information available. 

We reveal an elegant "divide-and-conquer" structure of the
off-line dynamic provisioning problem, under the cost model
that a running server consumes a fixed amount of energy per
unit time. Exploiting such structure, we show that its optimal
solution can be achieved by the data center adopting a simple
last-empty-server-first job-dispatching strategy and each server
independently solving a classic ski-rental problem.

We build upon this architectural insight to design three new
decentralized online algorithms. One is a deterministic
algorithm with competitive ratio $2 - \alpha$, where $0 \le \alpha \le 1$ is the
fraction of a critical window in which future workload information
is available. The size of the critical window is determined by
the wear-and-tear cost and the unit-time energy cost of running
a single server. The other two are randomized algorithms with
competitive ratios $(e - \alpha)/(e - 1) \approx 1.58 - \alpha/(e - 1)$ and
$e/(e - 1 + \alpha)$, respectively. $2 - \alpha$ and $e/(e - 1 + \alpha)$ are the
best competitive ratios for deterministic and randomized online
algorithms under our last-empty-server-first job-dispatching
strategy. Our results also lead to a fundamental observation
that, under the cost model that a running server consumes a
fixed amount of energy per unit time, future workload
information beyond the critical window will not improve the dynamic
provisioning performance.

Our algorithms are simple and easy to implement. Simu- 
lations using real-world traces show that our algorithms can 
achieve close-to-optimal energy-saving performance, and are 
robust to future-workload prediction errors. 

Our results, together with the 3-competitive algorithm
recently proposed by Lin et al. [20], suggest that it is possible
to reduce server energy consumption significantly with zero
or only partial future workload information.

An interesting and important future direction is to explore
what the best possible competitive ratio is that any algorithm
can achieve with zero or partial future workload information.
Insights along this line provide useful understanding of the
benefit of knowing future workload in dynamic provisioning.

Appendix

A. Proof of Proposition 1

Proof: The proof that the critical segment $[T_i^c, T_{i+1}^c]$ must
belong to one of the four types described in Proposition 1
is based on two cases.

Case 1: $T_i^c$ is a job-arrival epoch.

In this case, according to our Critical Segment Construction
Procedure, $T_{i+1}^c$ is the first departure epoch after $T_i^c$.
Then the workload in $[T_i^c, T_{i+1}^c]$ is non-decreasing, which means
$[T_i^c, T_{i+1}^c]$ is a Type-I critical segment.

Case 2: $T_i^c$ is a job-departure epoch.

In this case, we have two sub-cases. First, if we can find
the first arrival epoch $\tau$ after $T_i^c$ so that $a(\tau) = a(T_i^c)$,
then according to the Critical Segment Construction Procedure, we
let $T_{i+1}^c = \tau$. If $a(t) = a(T_i^c) - 1$, $\forall t \in (T_i^c, T_{i+1}^c)$, then
$[T_i^c, T_{i+1}^c]$ is a Type-III critical segment. Otherwise, $[T_i^c, T_{i+1}^c]$
is a Type-IV critical segment, in which $a(T_{i+1}^c) = a(T_i^c)$, and
$a(t) \le a(T_i^c) - 1$ and is not always identical, $\forall t \in (T_i^c, T_{i+1}^c)$.
Second, if no such $\tau$ exists, then we let $T_{i+1}^c$ be the next
job-departure epoch, and $[T_i^c, T_{i+1}^c]$ is a Type-II critical segment.
$a(t)$ in this segment is step-decreasing, which means
$a(t) = a(T_i^c) - 1$, $\forall t \in (T_i^c, T_{i+1}^c]$, and
$a(t) \le a(T_i^c) - 1$, $\forall t \in (T_{i+1}^c, T]$.

The above two cases cover all the possible situations of
the critical segment $[T_i^c, T_{i+1}^c]$, and we have shown that $[T_i^c, T_{i+1}^c]$ must
belong to one of the four types in both cases. Hence,
Proposition 1 is proved. ■



B. Proof of Lemma 2

Proof: Because at $t = T_1^c = 0$ we have $x^*(0) = a(0)$,
$x^*(t)$ meets $a(t)$ at the first critical time. We
use induction to prove that Lemma 2 holds for all the remaining
critical times. Specifically, given $x^*(T_i^c) = a(T_i^c)$, we
claim that $x^*(T_{i+1}^c) = a(T_{i+1}^c)$. We divide the situation into two
cases, and in each case we prove $x^*(T_{i+1}^c) = a(T_{i+1}^c)$ by
contradiction.

Case 1: $[T_i^c, T_{i+1}^c]$ is a Type-I, Type-III or Type-IV
critical segment, which means we must have $a(T_i^c) \le a(T_{i+1}^c)$.

If $x^*(T_{i+1}^c) > a(T_{i+1}^c)$, then we can find a time $\tau \in [T_i^c, T_{i+1}^c)$
such that $x^*(\tau) = a(T_{i+1}^c)$ and $x^*(t) > a(T_{i+1}^c)$, $\forall t \in (\tau, T_{i+1}^c]$.
Define $\bar{x}(t)$ as follows: $\bar{x}(t) = x^*(t)$, $\forall t \in [0, \tau] \cup (T_{i+1}^c, T]$,
and $\bar{x}(t) = a(T_{i+1}^c)$, $\forall t \in (\tau, T_{i+1}^c]$. It is clear that $\bar{x}(t)$ satisfies
the constraints of (3). Moreover, $x^*(t)$ causes more power
consumption than $\bar{x}(t)$, because $x^*(t)$ consumes more power
to run extra servers during $(\tau, T_{i+1}^c]$ and both have the same
power consumption for the rest of the time. This contradicts
the assumption that $x^*(t)$ is an optimal solution of (3). Therefore,
$x^*(T_{i+1}^c) = a(T_{i+1}^c)$.

Case 2: $[T_i^c, T_{i+1}^c]$ is a Type-II critical segment, which
means we must have $a(T_i^c) > a(T_{i+1}^c) \ge a(T)$.

If $x^*(T_{i+1}^c) > a(T_{i+1}^c)$, then because $x^*(T) = a(T) \le a(T_{i+1}^c)$,
we can find a time $T_{i+1}^c < \tau \le T$ such that $x^*(\tau) = a(T_{i+1}^c)$
and $x^*(t) > a(T_{i+1}^c)$, $\forall t \in [T_{i+1}^c, \tau)$. Define $\bar{x}(t)$ as follows:
$\bar{x}(t) = x^*(t)$, $\forall t \in [0, T_{i+1}^c) \cup [\tau, T]$, and $\bar{x}(t) = a(T_{i+1}^c)$, $\forall t \in
[T_{i+1}^c, \tau)$. It is clear that $\bar{x}(t)$ satisfies the constraints of (3) due to
the properties of Type-II critical segments. Moreover, $x^*(t)$
costs more power than $\bar{x}(t)$, because $x^*(t)$
consumes more power to run extra servers during $[T_{i+1}^c, \tau)$ and
both have the same power consumption for the rest of the time.
This contradicts the assumption that $x^*(t)$ is an optimal solution of (3).
Therefore, $x^*(T_{i+1}^c) = a(T_{i+1}^c)$.

The above two cases cover all the possibilities for the critical segment
$[T_i^c, T_{i+1}^c]$, and we have shown that $x^*(T_{i+1}^c) = a(T_{i+1}^c)$ in both
cases. Therefore, Lemma 2 is proved. ■

C. Proof of Lemma 3

Proof: Let $P_i'$ denote the power consumption in critical
segment $[T_i^c, T_{i+1}^c]$ if we let $x(t) = x^*(t)$, $\forall t \in [T_i^c, T_{i+1}^c]$.
According to Lemma 2, we have $x^*(T_i^c) = a(T_i^c)$ and
$x^*(T_{i+1}^c) = a(T_{i+1}^c)$. Therefore, $x^*(t)$, $\forall t \in [T_i^c, T_{i+1}^c]$, is a
solution to optimization problem (7). Thus, we have $P_i' \ge P_i^*$ and

\[
P^* = \sum_{i=1}^{M-1} P_i' \ge \sum_{i=1}^{M-1} P_i^* .
\]

This proves Lemma 3. ■

D. Proof of Theorem 4

Before proving Theorem 4, we first prove the following lemma.

Define $\mathcal{P}(A, B, T_s, T_e)$ as the following optimization
problem, where $[T_s, T_e]$ satisfies $a(T_s) = a(T_e)$, $a(t) < a(T_s)$, $\forall t \in (T_s, T_e)$,
and $(T_e - T_s) > \Delta$, and $A$, $B$ are constants that are greater than or
equal to $a(T_s)$.

\begin{align}
\min \quad & P \int_{T_s}^{T_e} x(t)\,dt + P_{on}(T_s, T_e) + P_{off}(T_s, T_e) \tag{13} \\
\text{s.t.} \quad & x(t) \ge a(t), \ \forall t \in [T_s, T_e], \tag{14} \\
& x(T_s) = A, \ x(T_e) = B, \tag{15} \\
\text{var} \quad & x(t) \in \mathbb{Z}^+, \ t \in [T_s, T_e]. \tag{16}
\end{align}

Lemma 9. The necessary condition for $x(t)$ to achieve the optimal
power consumption of $\mathcal{P}(A, B, T_s, T_e)$ is that $x(t) \le a(T_s) - 1$,
$\forall t \in (T_s, T_e)$.

Proof: Let $x_i(t)$ be any optimal solution to the above
optimization problem $\mathcal{P}(A, B, T_s, T_e)$, and suppose $x_i(t)$ does not satisfy
$x_i(t) \le a(T_s) - 1$, $\forall t \in (T_s, T_e)$. In order to prove the necessary
condition, we divide $x_i(t)$ into four cases.

(a) $x_i(t) \ge a(T_s)$, $\forall t \in (T_s, T_e)$.

In this case, let $\bar{x}(t) = a(T_s) - 1$, $\forall t \in (T_s, T_e)$; then $x_i(t)$
consumes at least $(T_e - T_s)P$ more power to run extra servers
than $\bar{x}(t)$ during $(T_s, T_e)$. On the other hand, $\bar{x}(t)$ causes at
most $\beta_{on} + \beta_{off}$ more wear-and-tear cost than $x_i(t)$. Because
$(T_e - T_s) > \Delta$, $x_i(t)$ actually costs more power than $\bar{x}(t)$, which
contradicts the assumption that $x_i(t)$ is an optimal solution.

(b) $\exists \tau \in (T_s, T_e)$ such that $x_i(\tau) = a(T_s) - 1$ and $x_i(t) > a(T_s) - 1$,
$\forall t \in (T_s, \tau)$.

In this case, let $\bar{x}(t) = a(T_s) - 1$, $\forall t \in (T_s, \tau)$, and $\bar{x}(t) =
x_i(t)$, $\forall t \in [\tau, T_e)$; then it is clear that $x_i(t)$ consumes more
power than $\bar{x}(t)$, which contradicts the assumption that $x_i(t)$ is an
optimal solution.

(c) $\exists \tau \in (T_s, T_e)$ such that $x_i(\tau) = a(T_s) - 1$ and $x_i(t) > a(T_s) - 1$,
$\forall t \in (\tau, T_e)$.

In this case, let $\bar{x}(t) = a(T_s) - 1$, $\forall t \in (\tau, T_e)$, and $\bar{x}(t) =
x_i(t)$, $\forall t \in (T_s, \tau]$; then it is clear that $x_i(t)$ consumes more
power than $\bar{x}(t)$, which contradicts the assumption that $x_i(t)$ is an
optimal solution.

(d) $x_i(t)$ does not satisfy the above three cases.

If $x_i(t)$ does not satisfy cases (a), (b) and (c), then there must exist
times $\tau_1$ and $\tau_2$ in $(T_s, T_e)$ such that $x_i(\tau_1) = x_i(\tau_2) = a(T_s) - 1$
and $x_i(t) > a(T_s) - 1$, $\forall t \in (\tau_1, \tau_2)$. Let $\bar{x}(t) = x_i(t)$, $\forall t \in
(T_s, \tau_1] \cup [\tau_2, T_e)$, and $\bar{x}(t) = a(T_s) - 1$, $\forall t \in (\tau_1, \tau_2)$. $\bar{x}(t)$
satisfies all the constraints of (13). It is also easy to verify that
$x_i(t)$ consumes more power than $\bar{x}(t)$, which contradicts
the assumption that $x_i(t)$ is an optimal solution.

The above four cases cover all possible situations of $x_i(t)$.
Therefore, the necessary condition for $x(t)$ to
be an optimal solution to (13) is that $x(t) \le a(T_s) - 1$, $\forall t \in
(T_s, T_e)$. ■

Now we are going to prove Theorem 4.

Proof: Let $x_i^*(t)$ denote the number of running servers
constructed by the Optimal Solution Construction Procedure
in critical segment $[T_i^c, T_{i+1}^c]$. We will prove that $x_i^*(t)$ is an
optimal solution of (7). The proof is based on the type of
critical segment $[T_i^c, T_{i+1}^c]$.

For critical segments of Type-I and Type-II, we claim that
$x_i^*(t) = a(t)$, $\forall t \in (T_i^c, T_{i+1}^c)$, can achieve $P_i^*$. Let $x_i(t)$ be
any solution to (7) such that $x_i(t)$ is not always equal to $a(t)$
during $(T_i^c, T_{i+1}^c)$. Because $a(t)$ is either non-decreasing or
step-decreasing in Type-I and Type-II critical segments, we
can find periods $(t_1, t_2)$ in $(T_i^c, T_{i+1}^c)$ such that $a(t_1) = x_i(t_1)$,
$a(t_2) = x_i(t_2)$ and $x_i(t) > a(t)$, $\forall t \in (t_1, t_2)$. One example
of such a period is $(t_3, t_4)$ in Fig. 5. It is clear that $x_i(t)$ costs
more power than $x_i^*(t)$ in each such period and both have the
same power consumption in the rest of the time during $(T_i^c, T_{i+1}^c)$.
Therefore, $x_i^*(t)$ is an optimal solution to (7) and can achieve
the optimal power consumption $P_i^*$.

It is clear that $x_i^*(t) \le a(T_i^c)$, $\forall t \in (T_i^c, T_{i+1}^c)$, for a Type-III
segment according to our Optimal Solution Construction
Procedure. We divide the proof of Theorem 4 for a Type-III
critical segment into two cases.

Case 1: $\Delta \ge (T_{i+1}^c - T_i^c)$.

In this case, we claim that $x_i^*(t) = a(T_i^c)$, $\forall t \in (T_i^c, T_{i+1}^c)$,
can achieve $P_i^*$. In fact, let $x_i(t)$ be any solution to (7) such that
$x_i(t)$ is not always equal to $a(T_i^c)$ during $(T_i^c, T_{i+1}^c)$. We will
prove that $x_i^*(t)$ does not cost more power than
$x_i(t)$ in $(T_i^c, T_{i+1}^c)$.

Since $x_i(t)$ is not always equal to $a(T_i^c)$ during $(T_i^c, T_{i+1}^c)$,
we can find a period $(t_1, t_2) \subseteq (T_i^c, T_{i+1}^c)$ such that $a(T_i^c) = x_i(t_1)$,
$a(T_i^c) = x_i(t_2)$ and $x_i(t) \ne a(T_i^c)$, $\forall t \in (t_1, t_2)$. One example
of such a period is $(t_1, t_2)$ in Fig. 6.

Figure 5: An example of solution $x_i(t)$ to (7) in a Type-I critical
segment. $x_i(t)$ is greater than $a(t)$ in $(t_1, t_2)$ and $(t_3, t_4)$.

Figure 6: An example of solution $x_i(t)$ to (7) in a Type-III
critical segment. $x_i(t)$ is not equal to $a(T_i^c)$ in $(t_1, t_2)$ and
$(t_3, t_4)$.

Figure 7: An example of solution $x_i(t)$ to (7) in a Type-IV
critical segment. $x_i(t)$ is not equal to $a(T_i^c)$ in $(t_1, t_2)$.

We compare the power consumed by $x_i(t)$ and $x_i^*(t)$ in
$(t_1, t_2)$ based on two situations. If $x_i(t) > a(T_i^c)$, $\forall t \in (t_1, t_2)$,
then $x_i(t)$ consumes more power to run extra servers than
$x_i^*(t)$ in each period $(t_1, t_2)$. If $x_i(t) = a(t) = a(T_i^c) - 1$, $\forall t \in
(t_1, t_2)$, on one hand, $x_i^*(t)$ costs at most $(T_{i+1}^c - T_i^c)P$ more
power to run one extra server than $x_i(t)$ in $(t_1, t_2)$. On the
other hand, $x_i(t)$ has to consume $(\beta_{on} + \beta_{off})$ more power to
turn a server off and on once in $(t_1, t_2)$. Since $\Delta \ge (T_{i+1}^c - T_i^c)$,
we have $(\beta_{on} + \beta_{off}) \ge (T_{i+1}^c - T_i^c)P$. This means $x_i^*(t)$ does
not cost more power than $x_i(t)$ in $(t_1, t_2)$. Therefore, in both
situations $x_i^*(t)$ does not cost more power than $x_i(t)$ in period
$(t_1, t_2)$. If there exist other periods like $(t_1, t_2)$ (one example is
$(t_3, t_4)$ in Fig. 6), we can prove that $x_i^*(t)$ does not cost more
power than $x_i(t)$ in these periods in the same way as we did
for $(t_1, t_2)$. On the other hand, $x_i(t)$ and $x_i^*(t)$ have the same
power consumption in the rest of the time in $(T_i^c, T_{i+1}^c)$. It follows
that $x_i^*(t)$ does not cost more power than $x_i(t)$ in $(T_i^c, T_{i+1}^c)$,
which means $x_i^*(t)$ is an optimal solution to (7).

Case 2: $\Delta < (T_{i+1}^c - T_i^c)$.

In this case, we claim that $x_i^*(t) = a(T_i^c) - 1$, $\forall t \in (T_i^c, T_{i+1}^c)$,
can achieve $P_i^*$, because we can turn off the newly idle server
at $T_i^c$ and turn on the server at $T_{i+1}^c$. In this way, we save
$(T_{i+1}^c - T_i^c)P$ power consumption, which is greater than the
on-off cost $\beta_{on} + \beta_{off}$. Thus, $x_i^*(t) = a(T_i^c) - 1$, $\forall t \in (T_i^c, T_{i+1}^c)$,
can achieve $P_i^*$.

For a Type-IV segment, we divide the situation into two cases
in the same way as we did for a Type-III segment.

Case 1: $\Delta \ge (T_{i+1}^c - T_i^c)$.

In this case, we claim that $x_i^*(t) = a(T_i^c)$, $\forall t \in (T_i^c, T_{i+1}^c)$,
can achieve $P_i^*$. The proof is similar to the proof for a Type-III
critical segment under the same situation $\Delta \ge (T_{i+1}^c - T_i^c)$. Let
$x_i(t)$ be any solution to (7) such that $x_i(t)$ is not always equal to
$a(T_i^c)$ during $(T_i^c, T_{i+1}^c)$. We will prove that $x_i^*(t)$ does not
cost more power than $x_i(t)$ in $(T_i^c, T_{i+1}^c)$. Because $x_i(t)$ is not
always equal to $a(T_i^c)$ during $(T_i^c, T_{i+1}^c)$, we can find a period
$(t_1, t_2) \subseteq (T_i^c, T_{i+1}^c)$ such that $a(T_i^c) = x_i(t_1)$, $a(T_i^c) = x_i(t_2)$
and $x_i(t) \ne a(T_i^c)$, $\forall t \in (t_1, t_2)$. One example of such a period
is $(t_1, t_2)$ in Fig. 7.

First, we compare the power consumed by $x_i(t)$ and
$x_i^*(t)$ in $(t_1, t_2)$ based on two situations. If $x_i(t) > a(T_i^c)$, $\forall t \in
(t_1, t_2)$, then $x_i(t)$ consumes more power to run extra servers
than $x_i^*(t)$ in period $(t_1, t_2)$. If $x_i(t) \le a(T_i^c) - 1$, $\forall t \in (t_1, t_2)$,
then a certain number of servers have been turned off
during $(t_1, t_2)$ for a certain amount of time. Denote by $\gamma$ the total
number of servers that have been turned off during $(t_1, t_2)$. On one
hand, $x_i^*(t)$ costs at most $\gamma (T_{i+1}^c - T_i^c)P$ power to run extra
servers in $(t_1, t_2)$. On the other hand, $x_i(t)$ has to consume
$\gamma (\beta_{on} + \beta_{off})$ power to turn servers off and on $\gamma$ times in $(t_1, t_2)$.
Since $\Delta \ge (T_{i+1}^c - T_i^c)$, we have $\gamma (\beta_{on} + \beta_{off}) \ge \gamma (T_{i+1}^c - T_i^c)P$.
This means $x_i^*(t)$ does not cost more power than $x_i(t)$ in
$(t_1, t_2)$. Therefore, in both situations $x_i^*(t)$ does not cost more
power than $x_i(t)$ in each period $(t_1, t_2)$. Moreover, $x_i(t)$ and
$x_i^*(t)$ have the same power consumption in the rest of the time in
$(T_i^c, T_{i+1}^c)$. It follows that $x_i^*(t)$ does not cost more power than
$x_i(t)$ in $(T_i^c, T_{i+1}^c)$, which means $x_i^*(t)$ is an optimal solution
to (7).

We consider Type-I, Type-II, Type-III and Type-IV segments
with $\Delta \ge (T_{i+1}^c - T_i^c)$ to be the four basic critical segments,
based on which we discuss the case of a Type-IV segment with
$\Delta < (T_{i+1}^c - T_i^c)$.

Case 2: A< (r?;, - Tf). 



Each job-departure epoch $\tau$ in $[T_i^c, T_{i+1}^c]$ has a corresponding job-arrival epoch $\tau'$ in $[T_i^c, T_{i+1}^c]$ such that $a(\tau) = a(\tau')$ and $a(t) < a(\tau), \forall t \in (\tau, \tau')$. We can find a set of job-departure and arrival epoch pairs $(\tau_1, \tau_1'), (\tau_2, \tau_2'), \ldots, (\tau_L, \tau_L')$ according to the procedure in Optimal Solution Construction Procedure for Type-IV critical segment with $\Delta < (T_{i+1}^c - T_i^c)$.

In order to prove that Optimal Solution Construction Procedure constructs an optimal solution to (7) in $[T_i^c, T_{i+1}^c]$ with $\Delta < (T_{i+1}^c - T_i^c)$, we are going to prove that an optimal solution $x^*(t)$ to (7) must meet $a(t)$ at every job-departure epoch $\tau$ and its corresponding job-arrival epoch $\tau'$ if $\tau \notin (\tau_l, \tau_l'), l = 1, 2, \ldots, L$. Based on this fact, we can prove that Optimal Solution Construction Procedure constructs an optimal solution.

It is clear that if $\tau \notin (\tau_l, \tau_l'), l = 1, 2, \ldots, L$, then we must have $\tau' \notin (\tau_l, \tau_l'), l = 1, 2, \ldots, L$. Otherwise, if $\tau' \in (\tau_l, \tau_l')$ for some $l \in \{1, 2, \ldots, L\}$, we must have $a(\tau_l) < a(\tau')$ because we have $a(t) < a(\tau), \forall t \in (\tau, \tau')$ for the job-departure and arrival epoch pair $(\tau, \tau')$. On the other hand, we also must have $a(\tau') < a(\tau_l)$ because $\tau' \in (\tau_l, \tau_l')$. This contradicts the previous conclusion $a(\tau_l) < a(\tau')$. Hence, $\tau' \notin (\tau_l, \tau_l'), l = 1, 2, \ldots, L$.

Now, we are going to prove that the necessary condition for $x^*(t)$ to achieve optimal power consumption in $[T_i^c, T_{i+1}^c]$ is that $x^*(t)$ must meet $a(t)$ at every job-departure epoch $\tau$ and its corresponding job-arrival epoch $\tau'$ if $\tau \notin (\tau_l, \tau_l'), l = 1, 2, \ldots, L$.

It is clear the necessary condition is satisfied when $(\tau, \tau') = (T_i^c, T_{i+1}^c)$. On the other hand, for any job-departure and arrival epoch pair $(\tau, \tau') \ne (T_i^c, T_{i+1}^c)$, we can always find another job-departure and arrival epoch pair $(\mu, \mu')$ covering $(\tau, \tau')$, i.e., $(\tau, \tau') \subset (\mu, \mu')$ and $a(\mu) = a(\tau) + 1$. Moreover, we must also have $(\mu' - \mu) > \Delta$, because if $(\mu' - \mu) \le \Delta$, then we must have $(\mu, \mu') \subseteq (\tau_l, \tau_l')$ for some $l \in \{1, 2, \ldots, L\}$. This means $\tau \in (\tau_l, \tau_l')$ for some $l \in \{1, 2, \ldots, L\}$, which contradicts $\tau \notin (\tau_l, \tau_l'), l = 1, 2, \ldots, L$.

Since $x^*(t)$ achieves the optimal power consumption in $[T_i^c, T_{i+1}^c]$, $x^*(t), t \in [\mu, \mu']$ must be an optimal solution to $\mathcal{P}(x^*(\mu), x^*(\mu'), \mu, \mu')$ with $\mu' - \mu > \Delta$. It follows that $x^*(t), t \in [\mu, \mu']$ must satisfy the necessary condition of the $\mathcal{P}(\cdot)$ problem stated in Lemma 9; hence, $x^*(t) \le a(\mu) - 1, t \in (\mu, \mu')$. Because $a(\tau) = a(\tau') = a(\mu) - 1$, we must have $x^*(\tau) = a(\tau)$ and $x^*(\tau') = a(\tau')$.

Note that according to the necessary condition, if $L = 0$, then $x^*(t)$ must meet $a(t)$ at every job-departure epoch $\tau$ and its corresponding job-arrival epoch $\tau'$ in $[T_i^c, T_{i+1}^c]$.

We are ready to prove that Optimal Solution Construction Procedure constructs an optimal solution to (7) in $[T_i^c, T_{i+1}^c]$ with $\Delta < (T_{i+1}^c - T_i^c)$. We prove it based on two cases.

(a) For all the job-departure and arrival epoch pairs $(\tau, \tau')$, we have $(\tau' - \tau) > \Delta$.

In this case, $x^*(t)$ must meet $a(t)$ at every job-departure epoch $\tau$ and job-arrival epoch $\tau'$ in $[T_i^c, T_{i+1}^c]$ according to the necessary condition we just proved. It is easy to verify that $a(t)$ between two consecutive epochs (the two epochs can be one of the following four cases: both are arrival epochs, both are departure epochs, the first is an arrival epoch and the other a departure epoch, or the first is a departure epoch and the other an arrival epoch) is one of the following smaller basic critical segments: Type-I, Type-II, Type-III with $(\tau' - \tau) > \Delta$. We already proved that $x^*(t) = a(t)$ is an optimal solution in these smaller basic critical segments. Therefore, we must have $x^*(t) = a(t), \forall t \in [T_i^c, T_{i+1}^c]$, which is the same as the solution $x_i^*(t)$ constructed by our Optimal Solution Construction Procedure. Hence, $x_i^*(t)$ can achieve optimal power consumption in $[T_i^c, T_{i+1}^c]$.

(b) There exist job-departure and arrival epoch pairs $(\tau, \tau')$ such that $(\tau' - \tau) \le \Delta$.

In this case, $x^*(t)$ must meet $a(t)$ at all job-departure epochs $\tau$ and job-arrival epochs $\tau'$ which are not in $(\tau_1, \tau_1') \cup (\tau_2, \tau_2') \cup \ldots \cup (\tau_L, \tau_L')$. We can also verify that $a(t)$ between two consecutive epochs which are not in $(\tau_1, \tau_1') \cup (\tau_2, \tau_2') \cup \ldots \cup (\tau_L, \tau_L')$ is one of the following smaller basic critical segments: Type-I, Type-II, Type-III and Type-IV with $(\tau' - \tau) \le \Delta$. Therefore, according to the optimal solution construction procedure of the four basic critical segments, we must have $x^*(t) = a(\tau), \forall t \in (\tau, \tau')$ when $(\tau, \tau')$ is a smaller basic critical segment of Type-III with $(\tau' - \tau) \le \Delta$ or of Type-IV with $(\tau' - \tau) \le \Delta$. In the remaining smaller basic segments, we must have $x^*(t) = a(t)$. The whole $x^*(t), \forall t \in [T_i^c, T_{i+1}^c]$ is the same as the solution $x_i^*(t)$ constructed by our Optimal Solution Construction Procedure. Hence, $x_i^*(t)$ can achieve optimal power consumption in $[T_i^c, T_{i+1}^c]$.

The above two cases cover all the possibilities for $[T_i^c, T_{i+1}^c]$. We proved that in each case the solution constructed by Optimal Solution Construction Procedure achieves the optimum. It follows that the solution constructed by Optimal Solution Construction Procedure achieves the optimum in $[T_i^c, T_{i+1}^c]$ with $\Delta < (T_{i+1}^c - T_i^c)$.

Because we only have a finite number of job arrivals and departures in $[0, T]$, and each basic critical segment or smaller basic critical segment contains at least one job arrival or departure epoch, the number of basic critical segments or smaller basic critical segments is finite, which means our construction terminates in finite time.

It is easy to verify that the pieces of $x(t)$ constructed for the critical segments connect to each other seamlessly. Moreover, since the constructed $x(t)$ achieves $P^*$ in each critical segment, the whole $x(t)$ achieves the lower bound of (3), which means it is an optimal solution to (3). We have thus proved Theorem 4. ■



E. Proof of Theorem 5

Proof: First, we want to prove that the number of running servers $x_o(t)$ proposed by our off-line algorithm meets $a(t)$ at every critical time $T_i^c$. The two coincide at the first critical time. Given $x_o(T_i^c) = a(T_i^c)$, we want to show that $x_o(T_{i+1}^c) = a(T_{i+1}^c)$.

(a) If $T_i^c$ is an arrival epoch, then $[T_i^c, T_{i+1}^c]$ is a Type-I segment, and there is no job departure during this critical segment and no idle server at $T_i^c$. The job-dispatching entity just pops a server ID and turns on the corresponding server to serve the new job. Thus, we have $x_o(T_{i+1}^c) = a(T_{i+1}^c)$.

(b) If $T_i^c$ is a departure epoch, then $[T_i^c, T_{i+1}^c]$ is one of the remaining three types of critical segments and we must have $a(T_{i+1}^c) = a(T_i^c)$ or $a(T_{i+1}^c) = a(T_i^c) - 1$.

When $a(T_{i+1}^c) = a(T_i^c) - 1$, the system only has one idle server right after $T_i^c$. The idle server must decide whether to remain idle or turn off. According to the definition of $T_{i+1}^c$, the idle server cannot find an arrival epoch $\tau$ after $T_i^c$ so that $a(\tau) = a(T_i^c)$. Based on our off-line algorithm, the server will turn itself off. Therefore, we have $x_o(T_{i+1}^c) = a(T_{i+1}^c)$.

When $a(T_{i+1}^c) = a(T_i^c)$, the critical segment is a Type-III or Type-IV segment and $a(t) \le a(T_i^c) - 1, \forall t \in (T_i^c, T_{i+1}^c)$. Because the number of arrival epochs is less than the number of departure epochs in $[T_i^c, t], \forall t \in (T_i^c, T_{i+1}^c)$, the job-dispatching entity has pushed more server IDs than it has popped in the period $[T_i^c, t]$. Therefore, the job-dispatching entity will not pop server IDs pushed before $T_i^c$ during $(T_i^c, T_{i+1}^c)$, which means the number of running servers during $(T_i^c, T_{i+1}^c)$ is less than or equal to $x_o(T_i^c)$. Because we have $x_o(T_i^c) = a(T_i^c) = a(T_{i+1}^c)$ and $x_o(T_{i+1}^c) \ge a(T_{i+1}^c)$, we must have $x_o(T_{i+1}^c) = a(T_{i+1}^c)$.

By induction, we have proved that $x_o(t)$ meets $a(t)$ at all the critical times.

Next, we are going to prove that $x_o(t)$ and the optimal solution $x^*(t)$ constructed by Optimal Solution Construction Procedure are the same. We divide the situation into four cases.

Case 1: For Type-I segment $[T_i^c, T_{i+1}^c]$.

There is no job departure during the non-decreasing critical segment $[T_i^c, T_{i+1}^c]$ and we have $a(T_i^c) = x_o(T_i^c)$, which means there is no idle server at $T_i^c$. According to our off-line algorithm, the job-dispatching entity just pops a server ID and turns on the corresponding server when a new job arrives. Thus, we have $x_o(t) = a(t) = x^*(t), \forall t \in [T_i^c, T_{i+1}^c]$.

Case 2: For Type-II segment $[T_i^c, T_{i+1}^c]$.

According to Proposition 1, for a step-decreasing segment we have $a(t) = a(T_i^c) - 1, \forall t \in (T_i^c, T_{i+1}^c)$. After the job departure at $T_i^c$, the newly idle server cannot find a time $t_1 \in (T_i^c, T_i^c + \Delta]$ so that $a(t_1) = a(T_i^c)$. Hence, based on our off-line algorithm, the server turns itself off and we have $x_o(t) = a(T_i^c) - 1 = a(t) = x^*(t), \forall t \in (T_i^c, T_{i+1}^c)$.

Case 3: For Type-III segment $[T_i^c, T_{i+1}^c]$.

For a Type-III segment, the job-dispatching entity will push a server ID at $T_i^c$ and pop it at $T_{i+1}^c$. If $\Delta < (T_{i+1}^c - T_i^c)$, the corresponding server cannot find a time $t_1 \in (T_i^c, T_i^c + \Delta]$ so that $a(t_1) = a(T_i^c)$, so our off-line algorithm will turn off the corresponding server and $x_o(t) = a(T_i^c) - 1, \forall t \in (T_i^c, T_{i+1}^c)$. If $\Delta \ge (T_{i+1}^c - T_i^c)$, the server will remain idle and $x_o(t) = a(T_i^c), \forall t \in [T_i^c, T_{i+1}^c]$. Hence, we have $x_o(t) = x^*(t), \forall t \in [T_i^c, T_{i+1}^c]$.

Case 4: For Type-IV segment $[T_i^c, T_{i+1}^c]$.

In this case, if $\Delta \ge (T_{i+1}^c - T_i^c)$, at each departure epoch in $[T_i^c, T_{i+1}^c]$, the corresponding newly idle server can find a time $t_2 \in (t_1, t_1 + \Delta]$ so that $a(t_2) = a(t_1)$, where $t_1$ is the departure epoch. Therefore, all the servers remain idle according to our off-line algorithm and $x_o(t) = a(T_i^c), \forall t \in [T_i^c, T_{i+1}^c]$. If $\Delta < (T_{i+1}^c - T_i^c)$, at each departure epoch $\tau$, our off-line algorithm will turn off the newly idle server if the corresponding arrival epoch $\tau'$ satisfies $\tau' - \tau > \Delta$, because the idle server cannot find a time $t_1 \in (\tau, \tau + \Delta]$ so that $a(\tau) = a(t_1)$. If $\tau' - \tau \le \Delta$, the newly idle server will remain idle. In this way, the number of running servers $x_o(t)$ decided by our off-line algorithm satisfies $x_o(t) = x^*(t), \forall t \in [T_i^c, T_{i+1}^c]$.

Based on the above four cases, we have proved that $x_o(t) = x^*(t)$. Therefore, $x_o(t)$ achieves the optimal value in the off-line situation according to Theorem 4. ■

F. Proof of Theorem 7

We are going to prove Theorem 7. Before doing so, we first prove Lemma 6 and three other lemmas.

Lemma 6. For the same $a(t), t \in [0, T]$, under the last-empty-server-first job-dispatching strategy, each server will get the same job at the same time and the job will leave the server at the same time for both off-line and online situations.

Proof: For both the off-line and online situations, we have the same $a(0)$ servers running at $t = 0$. The other servers are off and their IDs are stored in the stack in the same order at $t = 0$. Let $\Gamma_i$ denote the $i$th epoch at which a job departs from or arrives at the system in $[0, T]$. Assume the total number of arrival and departure epochs is $S$. To prove Lemma 6, we first claim that the same server IDs are stored in the stack in the same order for both the off-line and online situations in each period $[\Gamma_i, \Gamma_{i+1}), i = 1, 2, \ldots, S - 1$. Moreover, both situations have the same servers running and each running server serves the same corresponding job in $[\Gamma_i, \Gamma_{i+1})$. We will prove the claim by induction.

First, we prove that the claim is true for $[\Gamma_1, \Gamma_2)$. If $\Gamma_1$ is a job-arrival epoch, then in both the off-line and online situations the job-dispatching entity pops the same server ID to serve the new job, because both situations have the same server IDs in the stack, in the same order, at $t = 0$. After popping the server ID at the top of the stack at $\Gamma_1$, both situations still have the same server IDs stored in the stack in the same order, have the same servers running, and have each running server serving the same job. Because there is no job arrival or departure in $(\Gamma_1, \Gamma_2)$, no server ID is popped from or pushed onto the stack during $(\Gamma_1, \Gamma_2)$, which means both situations keep the same server IDs in the stack in the same order and the same servers running. Moreover, each running server serves the same corresponding job during $(\Gamma_1, \Gamma_2)$ in both situations.

If $\Gamma_1$ is a job-departure epoch, then in both the off-line and online situations the job-dispatching entity pushes the same server ID onto the stack, because both situations have the same servers serving the same jobs at $t = 0$. After pushing the server ID onto the stack at $\Gamma_1$, both situations still have the same server IDs stored in the stack in the same order. Moreover, both situations have the same servers running and each running server serves the same job. Because there is no job arrival or departure in $(\Gamma_1, \Gamma_2)$, both situations keep the same server IDs in the stack in the same order, the same servers running, and each running server serving the same job during $(\Gamma_1, \Gamma_2)$. Therefore, the claim is true for $[\Gamma_1, \Gamma_2)$ no matter whether $\Gamma_1$ is a job-arrival or a job-departure epoch.

Next, we prove the inductive step: if both situations have the same server IDs stored in the stack in the same order, the same servers running, and each running server serving the same job in $[\Gamma_{i-1}, \Gamma_i)$, then the same holds in $[\Gamma_i, \Gamma_{i+1})$. The proof is again based on two cases.

If $\Gamma_i$ is a job-arrival epoch, then in both the off-line and online situations the job-dispatching entity pops the same server ID to serve the new job, because both situations have the same server IDs in the stack, in the same order, in $[\Gamma_{i-1}, \Gamma_i)$. After popping the server ID at the top of the stack at $\Gamma_i$, both situations still have the same server IDs stored in the stack in the same order. They also have the same running servers, and each server serves the same job, because both situations have the same servers running and each server serving the same job in $[\Gamma_{i-1}, \Gamma_i)$. Because there is no job arrival or departure in $(\Gamma_i, \Gamma_{i+1})$, no server ID is popped from or pushed onto the stack during $(\Gamma_i, \Gamma_{i+1})$, which means both situations keep the same server IDs in the stack in the same order, the same servers running, and each running server serving the same job during $(\Gamma_i, \Gamma_{i+1})$.

If $\Gamma_i$ is a job-departure epoch, then in both the off-line and online situations the job-dispatching entity pushes the same server ID onto the stack, because both situations have the same servers serving the same jobs in $[\Gamma_{i-1}, \Gamma_i)$. After pushing the server ID onto the stack at $\Gamma_i$, both situations still have the same server IDs stored in the stack in the same order. Moreover, they have the same running servers and each server serves the same job, because both situations have the same servers running and each server serving the same job in $[\Gamma_{i-1}, \Gamma_i)$. Because there is no job arrival or departure in $(\Gamma_i, \Gamma_{i+1})$, both situations keep the same server IDs in the stack in the same order, the same servers running, and each running server serving the same job during $(\Gamma_i, \Gamma_{i+1})$. Therefore, the claim is true for $[\Gamma_i, \Gamma_{i+1})$ no matter whether $\Gamma_i$ is a job-arrival or a job-departure epoch.

Up to now, we have proved that the same server IDs are stored in the stack in the same order for both the off-line and online situations in each period $[\Gamma_i, \Gamma_{i+1}), i = 1, 2, \ldots, S - 1$. Moreover, both situations have the same servers running and each running server serves the same job in $[\Gamma_i, \Gamma_{i+1})$. Lemma 6 follows from this fact. If a server gets a job at a job-arrival epoch in the online situation, then the same server gets the same job at that job-arrival epoch in the off-line situation, because both situations have the same server ID stored at the top of the stack. On the other hand, if a job leaves a server in the online situation, then the same job leaves the same server in the off-line situation, because both situations have the same running server serving the same job. ■
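The argument of Lemma 6 can be illustrated with a small simulation of the stack-based dispatcher. The sketch below is a simplified illustration rather than the paper's implementation: it assumes the system starts with no jobs and every server ID in the stack, and the names `dispatch`, `events` and `num_servers` are hypothetical. The dispatcher takes no on/off policy as input, which is the reason the job-to-server assignment is identical in the off-line and online situations.

```python
def dispatch(events, num_servers):
    """Last-empty-server-first dispatching driven only by arrivals and departures.

    events is a time-ordered list of ("arrive", job) or ("depart", job) tuples.
    Returns a dict mapping each job to the server ID that served it.
    """
    # Top of the stack is the end of the list; server 0 starts on top.
    stack = list(range(num_servers - 1, -1, -1))
    serving = {}      # job -> server currently serving it
    assignment = {}   # job -> server that served it (kept after departure)
    for kind, job in events:
        if kind == "arrive":
            server = stack.pop()          # pop the most recently emptied server
            serving[job] = server
            assignment[job] = server
        else:
            stack.append(serving.pop(job))  # push the freed server's ID back
    return assignment
```

For example, with three servers and the event sequence arrive j1, arrive j2, depart j1, arrive j3, depart j2, arrive j4, job j3 reuses the server just freed by j1 and job j4 reuses the server just freed by j2, whatever power state those servers were in.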



Lemma 10. The deterministic online ski-rental algorithm we applied in our online algorithm A1 has competitive ratio $2 - \alpha$.

Proof: As we already proved in Lemma 6, for both online and off-line cases, a server faces the same set of jobs. From now on, we focus on one server. The job-dispatching entity will assign jobs to the server from time to time and we assume that the server will serve a total of $W$ jobs in $[0, T]$. Denote by $T_{j,s}$ the time in $[0, T]$ at which the server gets its $j$th job and by $T_{j,e}$ the time at which the $j$th job of the server leaves the system. Define $T_{W+1,s} = T$. The server must decide whether to turn itself off or stay idle between $T_{j,e}$ and $T_{j+1,s}$. In order to get the competitive ratio of the deterministic online ski-rental algorithm we applied in A1, we want to compare the power consumption $P_{j,on}$ of the online ski-rental algorithm in $[T_{j,s}, T_{j+1,s}], j \le W$ with the power consumption $P_{j,off}$ of the off-line ski-rental algorithm in $[T_{j,s}, T_{j+1,s}]$. In fact, the power consumption of the online and off-line ski-rental algorithms depends on the length of time between $T_{j,e}$ and $T_{j+1,s}$. Denote by $T_{j,B} = T_{j,e} - T_{j,s}$ the length of the busy period in $[T_{j,s}, T_{j+1,s}]$ and by $T_{j,E} = T_{j+1,s} - T_{j,e}$ the length of the empty period in $[T_{j,s}, T_{j+1,s}]$; then we have:

$$P_{j,off} = \begin{cases} P T_{j,B} + P T_{j,E}, & \text{if } T_{j,E} \le \Delta \\ P T_{j,B} + (\beta_{on} + \beta_{off}), & \text{if } T_{j,E} > \Delta \end{cases} \qquad (17)$$

According to the online ski-rental algorithm in A1, we also have:

$$P_{j,on} = \begin{cases} P T_{j,B} + P T_{j,E}, & \text{if } T_{j,E} \le \Delta \\ P T_{j,B} + (\beta_{on} + \beta_{off}) + P(1 - \alpha)\Delta, & \text{if } T_{j,E} > \Delta \end{cases} \qquad (18)$$

Hence, when $T_{j,E} \le \Delta$, $P_{j,on}/P_{j,off} = 1$; when $T_{j,E} > \Delta$,

$$\frac{P_{j,on}}{P_{j,off}} = \frac{P T_{j,B} + (\beta_{on} + \beta_{off}) + P(1 - \alpha)\Delta}{P T_{j,B} + (\beta_{on} + \beta_{off})} \le \frac{(\beta_{on} + \beta_{off}) + P(1 - \alpha)\Delta}{\beta_{on} + \beta_{off}} = 2 - \alpha.$$

In the above calculation, we used $P\Delta = (\beta_{on} + \beta_{off})$, and we have $P_{j,on}/P_{j,off} \le 2 - \alpha, \alpha \in [0, 1]$ for any $T_{j,E}$. Since this bound holds for every $j = 1, 2, \ldots, W$, the power consumption of the online ski-rental algorithm in $[0, T]$ is at most $(2 - \alpha)$ times the optimal, which means the competitive ratio of the deterministic online ski-rental algorithm applied in A1 is $2 - \alpha$. ■
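The bound of Lemma 10 can be checked numerically. The sketch below is an illustration under stated assumptions, not the paper's code: it encodes the two piecewise costs of the proof with $\beta_{on} + \beta_{off} = P\Delta$, sets the busy period to zero (the worst case for the ratio), and scans a grid of empty-period lengths. The function names and the grid are hypothetical choices.

```python
def offline_cost(t_busy, t_empty, P, delta):
    """Off-line cost of one busy-plus-empty cycle, following (17)."""
    # Idling costs P * t_empty; one off/on cycle costs beta_on + beta_off = P * delta.
    return P * t_busy + P * min(t_empty, delta)


def online_cost_a1(t_busy, t_empty, P, delta, alpha):
    """Cost of the deterministic rule: idle (1 - alpha) * delta, then peek alpha * delta ahead."""
    if t_empty <= delta:
        # The next job shows up inside the look-ahead window, so the server stays idle.
        return P * t_busy + P * t_empty
    # Otherwise it idles for (1 - alpha) * delta and then pays the on-off cost.
    return P * t_busy + P * delta + P * (1 - alpha) * delta


def worst_ratio(P, delta, alpha, steps=3000):
    """Largest online/off-line ratio over empty periods in (0, 3 * delta]."""
    worst = 0.0
    for i in range(1, steps + 1):
        t_empty = 3 * delta * i / steps
        ratio = online_cost_a1(0.0, t_empty, P, delta, alpha) / offline_cost(0.0, t_empty, P, delta)
        worst = max(worst, ratio)
    return worst
```

Under these assumptions the scanned worst case equals $2 - \alpha$, and any positive busy period only pulls the ratio closer to 1.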

Lemma 11. The randomized online ski-rental algorithm we applied in our online algorithm A2 has competitive ratio $(e - \alpha)/(e - 1)$.

Proof: In the proof, we still focus on one server and use the same notation as in the proof of Lemma 10. We want to compare the average power consumption $E(P_{j,on})$ of the randomized online ski-rental algorithm in $[T_{j,s}, T_{j+1,s}], j \le W$ with the power consumption $P_{j,off}$ of the off-line ski-rental algorithm in $[T_{j,s}, T_{j+1,s}]$. We have:

$$P_{j,off} = \begin{cases} P T_{j,B} + P T_{j,E}, & T_{j,E} \le \Delta \\ P T_{j,B} + (\beta_{on} + \beta_{off}), & T_{j,E} > \Delta \end{cases} \qquad (19)$$

According to the randomized online ski-rental algorithm, when $T_{j,E} \le \alpha\Delta$, we have

$$E(P_{j,on}) = P T_{j,B} + P T_{j,E}.$$

When $\alpha\Delta < T_{j,E} \le \Delta$, we have

$$E(P_{j,on}) = P T_{j,B} + \int_0^{T_{j,E} - \alpha\Delta} (Pz + \beta_{on} + \beta_{off}) f_Z(z)\,dz + P \int_{T_{j,E} - \alpha\Delta}^{(1 - \alpha)\Delta} T_{j,E} f_Z(z)\,dz.$$

When $T_{j,E} > \Delta$, we have

$$E(P_{j,on}) = P T_{j,B} + \int_0^{(1 - \alpha)\Delta} (Pz + \beta_{on} + \beta_{off}) f_Z(z)\,dz.$$

We get the above expected power consumption for $\alpha\Delta < T_{j,E} \le \Delta$ by the following reasoning. If the number $Z$ generated by the server is less than $T_{j,E} - \alpha\Delta$, then the server waits for $Z$ amount of time, consuming $PZ$ power. It then looks into the prediction window of size $\alpha\Delta$ and finds that it will not receive any job during the window because $Z < T_{j,E} - \alpha\Delta$. Therefore, it turns itself off, which costs $(\beta_{on} + \beta_{off})$. On the other hand, if $Z \ge T_{j,E} - \alpha\Delta$, the server will not turn itself off and consumes $P T_{j,E}$ to stay idle. We can get the expected power consumption for $T_{j,E} \le \alpha\Delta$ and $T_{j,E} > \Delta$ in the same way.
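Because

$$f_Z(z) = \begin{cases} \dfrac{e^{z/((1 - \alpha)\Delta)}}{(e - 1)(1 - \alpha)\Delta}, & \text{if } 0 \le z \le (1 - \alpha)\Delta; \\ 0, & \text{otherwise,} \end{cases}$$

we can calculate $E(P_{j,on})$ and the ratio between $E(P_{j,on})$ and $P_{j,off}$:

$$\frac{E(P_{j,on})}{P_{j,off}} = \begin{cases} 1, & T_{j,E} \le \alpha\Delta \\ \dfrac{P T_{j,B} + P\,(e T_{j,E} - \alpha\Delta)/(e - 1)}{P T_{j,B} + P T_{j,E}}, & \alpha\Delta < T_{j,E} \le \Delta \\ \dfrac{P T_{j,B} + P\Delta\,(e - \alpha)/(e - 1)}{P T_{j,B} + P\Delta}, & T_{j,E} > \Delta \end{cases}$$

From the above expression, we can conclude that $E(P_{j,on})/P_{j,off} \le \frac{e - \alpha}{e - 1}$ for any $T_{j,E}$. Since this holds for every $j = 1, 2, \ldots, W$ and $\alpha \in [0, 1]$, the power consumption of the online ski-rental algorithm in $[0, T]$ is at most $\frac{e - \alpha}{e - 1}$ times the optimal, which means the competitive ratio of the randomized online ski-rental algorithm applied in A2 is $\frac{e - \alpha}{e - 1}$. ■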
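The expectations in the proof of Lemma 11 can be checked by numerical integration. The following sketch is an illustration under stated assumptions: it takes the density of $Z$ to be the exponential-shaped density on $[0, (1 - \alpha)\Delta]$, uses $\beta_{on} + \beta_{off} = P\Delta$, ignores the busy period, requires $\alpha < 1$, and uses a midpoint rule. All function names are hypothetical.

```python
import math


def pdf(z, L):
    """Assumed density of the waiting time Z on [0, L], with L = (1 - alpha) * delta."""
    return math.exp(z / L) / ((math.e - 1) * L) if 0 <= z <= L else 0.0


def offline_idle_cost(t_empty, P, delta):
    """Off-line cost of an empty period: idle if short, one off/on cycle otherwise."""
    return P * min(t_empty, delta)


def expected_idle_cost(t_empty, P, delta, alpha, n=20000):
    """Expected online cost of an empty period of length t_empty (midpoint rule)."""
    L = (1 - alpha) * delta
    if t_empty <= alpha * delta:
        return P * t_empty              # the next job is already visible: stay idle
    # The server turns off only if Z + alpha * delta < t_empty, that is Z < cut.
    cut = min(t_empty - alpha * delta, L)
    h = L / n
    total = 0.0
    for i in range(n):
        z = (i + 0.5) * h
        cost = P * z + P * delta if z < cut else P * t_empty
        total += cost * pdf(z, L) * h
    return total
```

For $P = 1$, $\Delta = 2$ and $\alpha = 0.5$, the ratio `expected_idle_cost(t, 1.0, 2.0, 0.5) / offline_idle_cost(t, 1.0, 2.0)` stays below $(e - 0.5)/(e - 1)$ up to integration error and reaches it for empty periods of length $\Delta$ or more.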

Lemma 12. The randomized online ski-rental algorithm we applied in our online algorithm A3 has competitive ratio $e/(e - 1 + \alpha)$.

Proof: The only difference between A2 and A3 is that the random variable $Z$ has a different probability distribution. Therefore, the proof of Lemma 12 follows the same steps as the proof of Lemma 11, and it can be verified that A3 has competitive ratio $e/(e - 1 + \alpha)$. ■

Now we are ready to prove Theorem 7.

Proof: As we already proved for our off-line algorithm, the optimal power consumption of the data center can be achieved by having each server run the off-line ski-rental algorithm individually and independently. On the other hand, in Lemmas 10, 11 and 12 we proved that the power consumption of the deterministic and randomized online ski-rental algorithms we applied is at most $2 - \alpha$, $\frac{e - \alpha}{e - 1}$ and $\frac{e}{e - 1 + \alpha}$ times the power consumption of the off-line ski-rental algorithm for one server. Therefore, the power consumption of our online algorithms A1, A2 and A3 is at most $2 - \alpha$, $\frac{e - \alpha}{e - 1}$ and $\frac{e}{e - 1 + \alpha}$ times the power consumption of the off-line algorithm for the data center, which means the competitive ratios of A1, A2 and A3 are $2 - \alpha$, $\frac{e - \alpha}{e - 1}$ and $\frac{e}{e - 1 + \alpha}$, respectively.



Next, we want to prove that A1 has the best competitive ratio among deterministic online algorithms under our job-dispatching strategy. Assume that a deterministic online algorithm peeks into the future window and then decides whether to turn itself off or stay idle $\theta\Delta$ after becoming empty at $t_1$. When $\theta < 1 - \alpha$, if the server receives its next job right after $t_1 + (\theta + \alpha)\Delta$, then the online algorithm will turn itself off at $t_1 + \theta\Delta$ and consume $P(\theta + 1)\Delta$ power. On the other hand, the offline optimal is $(\alpha + \theta)P\Delta$. The competitive ratio is at least $\frac{\theta + 1}{\theta + \alpha} > 2 - \alpha$.

When $\theta > 1 - \alpha$, if the server receives its next job right after $t_1 + (\theta + \alpha)\Delta$, then the online algorithm will turn itself off at $t_1 + \theta\Delta$ and consume $P(\theta + 1)\Delta$ power. On the other hand, the offline optimal is $P\Delta$. The competitive ratio is at least $1 + \theta > 2 - \alpha$.

Based on the above two cases, only when $\theta = 1 - \alpha$ does the deterministic algorithm attain the better competitive ratio $2 - \alpha$. Therefore, the best deterministic online algorithm is A1, which has competitive ratio $2 - \alpha$.
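The two-case lower bound can be tabulated directly. The sketch below is a hedged illustration: `adversarial_ratio` encodes the two ratios stated in the argument (with the boundary $\theta = 1 - \alpha$ assigned to the second branch, where both expressions coincide), and `best_threshold` scans a hypothetical grid of waiting thresholds $\theta$ to find the one with the smallest ratio.

```python
import math


def adversarial_ratio(theta, alpha):
    """Lower bound on the competitive ratio when waiting theta * delta before deciding."""
    if theta + alpha < 1:
        if theta + alpha == 0:
            return math.inf   # a job arriving immediately makes the off-line cost vanish
        # The next job arrives right after (theta + alpha) * delta.
        return (theta + 1) / (theta + alpha)
    # The off-line optimum is one off/on cycle, P * delta.
    return 1 + theta


def best_threshold(alpha, grid=100):
    """Grid search over theta in [0, 2] for the smallest adversarial ratio."""
    candidates = [i / grid for i in range(2 * grid + 1)]
    return min(candidates, key=lambda theta: adversarial_ratio(theta, alpha))
```

For $\alpha = 0.25$ the search returns $\theta = 0.75 = 1 - \alpha$, with ratio $1.75 = 2 - \alpha$.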



Finally, we want to prove that A3 has the best competitive ratio among randomized online algorithms under our job-dispatching strategy. Assume that the server becomes empty at $t_1$ and receives its next job at $t_2$. In order to find the best competitive ratio for randomized online algorithms, according to the proof of Lemma 11, it is sufficient to find the minimal ratio of the power consumed by a randomized online algorithm to that of the offline optimal in $[t_1, t_2]$. We first chop the time period $[t_1, t_2]$ into small time slots. Then we let the length of a slot go to zero, which gives the best competitive ratio for continuous-time randomized online algorithms.

Assume the critical interval $\Delta$ contains exactly $b$ slots and there are $D$ slots in $[t_1, t_2]$. Moreover, the future window has $k \le b - 1$ slots. (If $k \ge b$, the online algorithm can achieve the offline optimal and the competitive ratio is 1.) Let $P_i$ denote the probability that the algorithm decides to turn off the server at slot $i$. Define $c$ as the competitive ratio. Then we can solve the following optimization problem to find the minimal competitive ratio.
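$$\begin{aligned}
\inf\ & c && (20) \\
\text{s.t.}\ & D \sum_{i=1}^{\infty} P_i \le cD, \quad \forall D \in [0, k], && (21) \\
& \sum_{i=1}^{D-k} (b + i - 1) P_i + \sum_{i=D-k+1}^{\infty} D P_i \le cD, \quad \forall D \in (k, b), && (22) \\
& \sum_{i=1}^{D-k} (b + i - 1) P_i + \sum_{i=D-k+1}^{\infty} D P_i \le cb, \quad \forall D \in [b, \infty), && (23) \\
& \sum_{i=1}^{\infty} P_i = 1, && (24) \\
\text{var}\ & c, P_i, \ \forall i \in \{1, 2, \ldots, \infty\}. && (25)
\end{aligned}$$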


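We are going to prove that the optimal value $c^*$ of problem (20) is equal to the optimal value $\bar{c}^*$ of the following problem.

$$\begin{aligned}
\min\ & c && (26) \\
\text{s.t.}\ & D \sum_{i=1}^{b-k} P_i \le cD, \quad \forall D \in [0, k], && (27) \\
& \sum_{i=1}^{D-k} (b + i - 1) P_i + \sum_{i=D-k+1}^{b-k} D P_i \le Dc, \quad \forall D \in (k, b), && (28) \\
& \sum_{i=1}^{b-k} (b + i - 1) P_i \le bc, \quad \forall D \in [b, \infty), && (29) \\
& \sum_{i=1}^{b-k} P_i = 1, && (30) \\
\text{var}\ & c, P_i, \ \forall i \in \{1, 2, \ldots, b - k\}. && (31)
\end{aligned}$$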

First, it is easy to see that every solution to (26) is a solution to (20). Therefore, we have $c^* \le \bar{c}^*$.

On the other hand, let $[P_1, P_2, P_3, \ldots]$ be an optimal solution achieving $c^*$ in (20). If $P_i = 0, \forall i > b - k$, then $[\bar{P}_1, \bar{P}_2, \bar{P}_3, \ldots, \bar{P}_{b-k}] = [P_1, P_2, P_3, \ldots, P_{b-k}]$ and $c = c^*$ satisfy the constraints of (26), which means $c^* \ge \bar{c}^*$.

If there exists $i > b - k$ such that $P_i > 0$, then we can prove that

$$[\bar{P}_1, \bar{P}_2, \ldots, \bar{P}_{b-k}] = \Big[P_1, P_2, \ldots, P_{b-k-1},\ P_{b-k} + \sum_{i=b-k+1}^{\infty} P_i\Big]$$

and $c = c^*$ satisfy the constraints of (26). In fact, when $D < b$, it is easy to verify that the coefficient of $P_i, \forall i > b - k$ is equal to the coefficient of $P_{b-k}$ in each constraint of (20). Therefore, the vector $[\bar{P}_1, \bar{P}_2, \ldots, \bar{P}_{b-k}]$ defined above and $c = c^*$ satisfy the constraints of (26) for $D < b$.

When $D = 2b - k - 1$ in (20), we have

$$\sum_{i=1}^{2b-2k-1} (b + i - 1) P_i + \sum_{i=2b-2k}^{\infty} (2b - k - 1) P_i \le bc. \qquad (32)$$

It is easy to verify that in (32) the coefficient of $P_i, \forall i > b - k$ is equal to or greater than the coefficient of $P_{b-k}$. Therefore, when $D \ge b$, the vector $[\bar{P}_1, \bar{P}_2, \ldots, \bar{P}_{b-k}]$ defined above and $c = c^*$ still satisfy the constraints of (26) due to (32).

Hence, in both cases, $[\bar{P}_1, \bar{P}_2, \ldots, \bar{P}_{b-k}]$ and $c = c^*$ satisfy the constraints of (26), so we must have $c^* \ge \bar{c}^*$. Since we already proved that $c^* \le \bar{c}^*$, we must have $c^* = \bar{c}^*$.

Next, we are going to prove that an optimal solution $P^* = [P_1^*, P_2^*, P_3^*, \ldots, P_{b-k}^*]$ to (26) must satisfy $P_i^* > 0, \forall i \le b - k$.

First, if $P_1^* = 0$, let $j$ be the minimal $i$ such that $P_i^* > 0$. Then it can be verified that the constraints of (26) must hold as strict inequalities for $D \le k + j - 1$.

On the other hand, the coefficient of $P_1^*$ must be less than that of $P_j^*$ in the constraints for $D > k + j - 1$. Therefore, we can decrease $P_j^*$ a little and increase $P_1^*$ a little such that all the constraints of (26) have slackness, which means we can find a smaller $c$ that satisfies all the constraints. This contradicts the assumption that $P^* = [P_1^*, P_2^*, P_3^*, \ldots, P_{b-k}^*]$ is an optimal solution. Therefore, we must have $P_1^* > 0$.

Second, if there exists $h > 1$ such that $P_h^* = 0$, then we can decrease $P_1^*$ a little and increase $P_h^*$ a little. This does not violate the constraints for $D \le k + h - 1$, since the coefficient of $P_1^*$ must be greater than or equal to that of $P_h^*$ in those constraints. On the other hand, when $D \ge k + h$, we want to compare the following constraints for $D = k + 1$ and $D \ge k + h$:

$$b P_1 + (k + 1) \sum_{i=2}^{b-k} P_i \le c(k + 1),$$

$$\sum_{i=1}^{h} (b + i - 1) P_i + \sum_{i=h+1}^{b-k} (k + h) P_i \le c(k + h),$$

$$\sum_{i=1}^{h+1} (b + i - 1) P_i + \sum_{i=h+2}^{b-k} (k + h + 1) P_i \le c(k + h + 1).$$

When $D \ge k + h$, it is clear that the coefficient of $P_h^*$ is at most $(h - 1)$ greater than the coefficient of $P_1^*$. Therefore, when we decrease $P_1^*$ a little and increase $P_h^*$ a little, the left side of those constraints increases by at most $(h - 1)$ compared with the case $D = k + 1$. However, the right side increases by at least $(h - 1)c$. Hence, after we decrease $P_1^*$ a little and increase $P_h^*$ a little, all the constraints of (26) have slackness, which means we can find a smaller $c$. This contradicts the assumption that $P^* = [P_1^*, P_2^*, P_3^*, \ldots, P_{b-k}^*]$ is an optimal solution. Therefore, $P_h^* > 0, \forall h \in [2, b - k]$. Up to now, we have proved that an optimal solution $P^* = [P_1^*, P_2^*, P_3^*, \ldots, P_{b-k}^*]$ to (26) must satisfy $P_i^* > 0, \forall i \le b - k$.

Because (26) is a linear optimization problem and the optimal value is not negative infinity, an optimal solution must be a vertex of the polyhedron. Moreover, we have $P_i^* > 0, \forall i \le b - k$. Hence, the constraints $P_i \ge 0$ cannot be active. On the other hand, the dimension of the variable vector is equal to the number of the remaining independent constraints in (26). Therefore, an optimal solution must be the vertex that makes all the constraints other than $P_i \ge 0$ active, which means all the inequalities must hold as equalities.

We can solve the resulting linear equation system to get the minimal competitive ratio and probability distribution. Letting $b$ go to infinity with $k/b = \alpha$, we have $c = \frac{e}{e - 1 + \alpha}$.

This means the minimal competitive ratio for continuous-time randomized online algorithms is $c = \frac{e}{e - 1 + \alpha}$.

Therefore, we have proved Theorem 7. ■
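The limit can be illustrated numerically. The sketch below is not taken from the paper: it assumes, as the argument above concludes, that constraints (28) and (29) hold with equality, and it uses a closed form obtained here by subtracting consecutive tight constraints, namely $(n-1)P_1 = (c-1)(k+1)$, $(n-1)P_2 = c - 1 + P_1$ and $P_{m+1} = \frac{n}{n-1} P_m$ for $m \ge 2$, with $n = b - k \ge 2$. The function names are hypothetical. The helper `max_violation` re-checks the resulting point against constraints (27) to (29).

```python
import math


def min_ratio(b, k):
    """Ratio c and probabilities P_1..P_n (n = b - k >= 2) with (28)-(29) tight.

    The closed form is derived for this sketch, not quoted from the paper.
    """
    n = b - k
    if n < 2:
        raise ValueError("this sketch needs b - k >= 2")
    r = n / (n - 1)
    c = 1 + 1 / ((b / (n - 1)) * r ** (n - 1) - 1)
    p = [(c - 1) * (k + 1) / (n - 1)]      # P_1
    p.append((c - 1 + p[0]) / (n - 1))     # P_2
    for _ in range(n - 2):                 # P_3 .. P_n grow geometrically
        p.append(p[-1] * r)
    return c, p


def max_violation(b, k, c, p):
    """Largest (lhs - rhs) over the constraints for D = 1 .. 2b - 1."""
    n = b - k
    worst = -math.inf
    for D in range(1, 2 * b):
        m = min(max(D - k, 0), n)          # slots whose turn-off decision is carried out
        lhs = sum((b + j) * p[j] for j in range(m)) + D * sum(p[m:])
        worst = max(worst, lhs - c * min(D, b))
    return worst


def limit_ratio(alpha):
    """Continuous-time value e / (e - 1 + alpha)."""
    return math.e / (math.e - 1 + alpha)
```

For $b = 2$, $k = 0$ this gives $c = 4/3$, and for large $b$ with $k/b = 0.5$ the value of `min_ratio(b, k)[0]` approaches `limit_ratio(0.5)`.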

G. Proof of Corollary 8

Proof: As we already showed, under our last-empty-server-first job-dispatching strategy, each server serves the same set of jobs in both the online and offline situations. Moreover, the power consumption of the data center is minimal if each server runs the off-line ski-rental algorithm individually and independently in the off-line situation. Therefore, if each server runs an online ski-rental algorithm individually and independently in the online situation, and the competitive ratio of the online ski-rental algorithm is $R$, then the total power consumption is at most $R$ times the minimal power consumption.

However, we must apply a discrete-time online ski-rental algorithm for the discrete-time fluid workload model because we chopped time into equal-length slots. According to [29], the competitive ratio of the discrete-time online ski-rental algorithm is less than or equal to that of the continuous-time online ski-rental problem. Therefore, our modified deterministic and randomized online algorithms retain competitive ratios $2 - \alpha$, $\frac{e - \alpha}{e - 1}$ and $\frac{e}{e - 1 + \alpha}$, where $\alpha$ is the ratio of the number of time slots in the future window to the number of slots in the critical interval $\Delta$. ■